Hadoop Studio is a map-reduce integrated development environment (IDE) based on NetBeans. It makes it easy to create, understand, and debug map-reduce applications built on Hadoop, without requiring development-time access to a map-reduce cluster. The studio provides a workflow view of a map-reduce job that displays the individual inputs, outputs, and interactions between the job's phases, and that updates in real time as the developer's code changes. The studio then generates Java sources and compiles them into a binary JAR file, which can be run on a normal Hadoop cluster.
Jug is a task-based parallelism framework: it lets you write code that is broken up into tasks and run different tasks on different processors. Jug uses the filesystem to communicate between processes and works correctly over NFS, so you can coordinate processes on different machines. It is a pure Python implementation and should work on any platform that can run Python.
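A minimal sketch of Jug's task model, using its TaskGenerator decorator (the script name and the primality workload are illustrative, not part of Jug): each decorated call becomes a task whose result is memoized on the filesystem, so several worker processes can share the work.

    # primes.py -- example script name; run with "jug execute primes.py"
    from jug import TaskGenerator

    @TaskGenerator
    def is_prime(n):
        # Naive primality check; each decorated call becomes a separate task.
        return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

    @TaskGenerator
    def count(flags):
        # Depends on every is_prime task; runs once their results are on disk.
        return sum(flags)

    n_primes = count([is_prime(n) for n in range(2, 10000)])

Starting "jug execute primes.py" in several shells, or on several machines that see the same directory over NFS, lets each process claim unfinished tasks; "jug status primes.py" reports progress.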
Plasma implements the map/reduce programming model on a compute cluster. It has its own distributed filesystem, PlasmaFS, which is transactional (ACID), reliable, and fast, and which provides a complete set of file operations. PlasmaFS can be accessed via an RPC protocol or via NFS (i.e., it is mountable). Additionally, a key/value database is layered on top of PlasmaFS.
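Because PlasmaFS is NFS-mountable, ordinary file operations are enough to stage data on it; the sketch below uses only the Python standard library and assumes a hypothetical mount point /mnt/plasmafs (no Plasma-specific API is shown).

    from pathlib import Path

    # /mnt/plasmafs is an assumed NFS mount of PlasmaFS, not a fixed path.
    job_dir = Path("/mnt/plasmafs") / "jobs" / "wordcount"
    job_dir.mkdir(parents=True, exist_ok=True)

    # Plain file I/O goes through the NFS interface; PlasmaFS handles
    # replication and transactional metadata underneath.
    (job_dir / "input.txt").write_text("to be or not to be\n")
    words = (job_dir / "input.txt").read_text().split()
    print(len(words), "words staged for the map/reduce job")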
MapReduce-BitDew is an implementation, for Internet Desktop Grids, of the MapReduce programming model proposed by Google. Using MapReduce-BitDew, you can run MapReduce applications on resources such as desktop PCs distributed across the Internet. MapReduce-BitDew features a firewall-friendly protocol, fault tolerance, result certification, two-level scheduling, and more.
dispy is a Python framework for parallel execution of computations by distributing them across multiple processors in a single machine (SMP) or among many machines in a cluster or grid. The computations can be standalone programs or Python functions. dispy is well suited to the data-parallel (SIMD) paradigm, in which a computation is evaluated independently with different (large) datasets (similar to Hadoop, MapReduce, and Parallel Python). dispy's features include automatic distribution of dependencies (files, Python functions, classes, and modules), client-side and server-side fault recovery, scheduling of computations to specific nodes, encryption for security, sharing of computation resources if desired, and more.
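A minimal sketch of the data-parallel usage described above, based on dispy's JobCluster/submit API (the compute function and job count are illustrative); it assumes dispynode servers are already running and discoverable on the network.

    import dispy

    def compute(n):
        # Runs on whichever dispynode the scheduler picks; keep imports
        # inside the function so they are resolved on the remote node.
        import time
        time.sleep(1)
        return n * n

    if __name__ == "__main__":
        cluster = dispy.JobCluster(compute)           # distribute 'compute' to available nodes
        jobs = [cluster.submit(n) for n in range(8)]  # one job per input value
        for n, job in enumerate(jobs):
            print(n, job())                           # job() blocks until that result is ready
        cluster.close()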