Jeffrey A. Bilmes: Software and Data
- The graphical models toolkit
Software for Windows.
PhiPAC, an auto-tuning matrix-matrix multiply library (and
the first auto-tuning dense linear algebra library).
propagation code and web page, a semi-supervised learning algorithm
based on a graph Laplacian manifold approximation that uses an
objective function based on KL-divergence (the code also includes a
fast parallel C++ implementation).
The Buried Markov Model (BMM)
code (includes mixtures of sparse linear conditional multi-time
Gaussian models). Sorry, no documentation.
Multi-party meeting scheduling with simple
preference aggregation rules.
Extensions to the old Berkeley parallel make software (or what is
known as pmake). The original pmake utility is
We have made a number of significant extensions to pmake, including a full
GNU autoconf configuration, many new resource constraints (including dynamic ones), and other
features (we also removed some old features that were no longer
needed). The complete source code is at
pmake-3.0-alpha. Note that this
is an alpha release that basically works, but there are no plans
for additional work on it (at least by me or my group), nor can I answer
any further questions about it (see the source code).
- Corpus definitions and baseline systems
for both the SVitchboard-II and FiSVer-I datasets
can be found at
link. The paper describing it is
, a set of difficult-to-segment images (with elongated or narrow
structures and contrast gradients) along with ground-truth labellings,
which were used in the following
A small amount of hand-aligned
French/English data, useful for statistical machine translation systems,
done by Karim Filali.
multi-channel, real-world, in-situ noisy speech corpus
(now available for download).
The Semi-Supervised Switchboard Transcription (S3TP) project
In the 1990s, the
Switchboard Transcription Project gave us 1.5 hours of
frame-by-frame phonetically transcribed Switchboard conversational
speech data. Here, we have used a modern semi-supervised learning
algorithm to phonetically label, at the frame level, the remaining 250
hours of SWB-I; we call this the Semi-Supervised Switchboard
Transcription project (or S3TP). The data and algorithms