MPCA


MPCA is a suite of tools for discrete principal components analysis on data sets of 100 MB or more. It scales by using sparse vectors, multi-threading, memory mapping, and other POSIX techniques. Reporting tools, file-dumping utilities, and other utilities are included. The general problem of discrete components analysis is variously called grade of membership, PLSA, non-negative matrix factorization, multinomial admixtures, LDA, and multinomial PCA.

Recent releases

  •  20 Feb 2007 22:33

    Release Notes: Some linkages to the ALVIS system allow the software to be used to create topic models and annotate linguistically tagged content. The linkBags Perl utilities were cleaned up and moved out to CPAN. Some of the models can be seen in action in the online search demos.

  •  22 Oct 2006 12:31

    Release Notes: A bug in mpbags that made it constantly use the CPU was fixed. There is no need to update from 1.54 if you don't use mpbags.

  •  18 Oct 2006 14:41

    Release Notes: This release focuses on integration with an external tool suite. Useful new scripts include the linkBags, linkTables, and linkMpca set for running MPCA on Web link data augmented with names and title text.

  •  23 Jun 2006 13:38

    Release Notes: This release adds a significant bugfix to gibbsk sampling and new capabilities to ALVIS support (still incomplete and undocumented). Users should upgrade to this release.

  •  05 Sep 2005 08:33

    Release Notes: This release compiles under Cygwin. Many other minor updates and moderate bugfixes were made.

