May 13th, 2014
Emdros version 3.4.0 has been released. This is the first public release in almost three years. Development has been very active, however, and several of you have been receiving updates by simply requesting them.
You can find the source code and binaries via this link:
This release also sees a new package, emdros-example-2.0. It contains a complete sample database, namely the complete King James Version of the Bible with parse trees and part-of-speech information, along with sample queries and configuration files for the Emdros Query Tool and the Emdros Chunking Tool. It is a separate download, so be sure to get it if you want to see what Emdros can do.
The focus areas of this release include:
- Support for iOS and Android. This is at the library level only — no apps are bundled, but instructions for making them yourself are included.
- A smoother user experience for the GUI applications
- Enhancements to the MQL query language
- Documentation improvements
- Build improvements on all platforms
- Speed improvements
The Release Notes can be seen here:
January 29th, 2013
For a number of years, I have had a proprietary extension to Emdros which makes an attempt at encrypting SQLite 3 databases that are used with Emdros. The encryption isn’t strong, and is meant as an attempt at fooling “most people”, not including professional cryptographers. Thus it is primarily meant as a way to copy-protect (DRM-enable) content that content-creators may wish to protect from their customers. Until today, it has been using a key longer than 56 bits.
Today I successfully added a key scheduling algorithm which allows the cipher to function with only a 56-bit key. This is significant, since, according to my research, it allows export from the US without restriction. Thus Apple and others can happily ship Emdros-driven apps from their app stores (e.g., the iOS and/or Mac app store) without requiring a license to export from the US.
If you are interested in licensing this encryption, please get in touch via the email address mentioned on http://emdros.org/contact.html.
January 1st, 2013
Emdros uses, internally, a data structure known as a “Skip List”. It is basically a glorified linked list, with randomization applied to make it look somewhat like a balanced binary tree. It’s efficient both space-wise and time-wise for a range of problems, though it often cannot beat a really good implementation of a red-black tree, for example.
Skip lists were invented by William Pugh, and described in a paper from the early 1990s. In the paper, Pugh described the various options for the randomization to be applied to the data structure. Based on Pugh’s recommendations, I originally chose a particular kind of randomization, but failed to experiment with different kinds. Until today, that is.
I found that by simply tweaking the number of bits to consume from the random number, as well as raising the number of elements catered for in the data structure, Emdros as a whole could be made to run consistently about 5-6% faster across my various test suites.
That’s a lot of speed increase in exchange for three changed lines of code.
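To illustrate the two knobs mentioned above, here is a minimal Python sketch of Pugh-style level selection. It is not the Emdros code: the function names, parameter names, and default values are all assumptions made for illustration. Consuming `bits_per_level` random bits per promotion gives a promotion probability of p = 2^-bits_per_level, and the level cap determines roughly how many elements the structure caters for, about (1/p)^max_level.

```python
import random


def random_level(rng, bits_per_level=2, max_level=16, word_bits=32):
    """Pick a node level, consuming bits_per_level random bits per promotion.

    A node is promoted one level each time the consumed bits are all zero,
    so the promotion probability is p = 2 ** -bits_per_level.
    Names and defaults are illustrative assumptions, not Emdros values.
    """
    mask = (1 << bits_per_level) - 1
    word = rng.getrandbits(word_bits)
    bits_left = word_bits
    level = 1
    while level < max_level:
        if bits_left < bits_per_level:
            # The random word is used up; draw a fresh one.
            word = rng.getrandbits(word_bits)
            bits_left = word_bits
        chunk = word & mask
        word >>= bits_per_level
        bits_left -= bits_per_level
        if chunk != 0:
            break
        level += 1
    return level


def expected_capacity(bits_per_level, max_level):
    """Rough element count the level cap caters for: (1/p) ** max_level."""
    return (1 << bits_per_level) ** max_level
```

Changing `bits_per_level` from 1 to 2 moves p from 1/2 to 1/4, which is the kind of trade-off Pugh's paper discusses: fewer pointers per node in exchange for slightly longer searches on each level.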
Incidentally, while running the tests, I found that the BPT engine (my proprietary backend engine) is still at least 30% faster than the SQLite 3 backend for the same database content and the same set of queries.
February 17th, 2012
I’ve successfully made the files required for building a .deb package on Debian, Ubuntu, and other Debian-derived Linux distributions.
Interested parties are welcome to contact me for the sources.
January 3rd, 2012
The Emdros blog is kindly hosted by the J. Alan Groves Center for Advanced Biblical Research. The Groves Center suffered a hardware outage in late 2011, bringing this blog down.
Thanks to the hard work of Dr. Kirk Lowery, the blog is now back. Thanks, Kirk!
More news coming. Stay tuned!
July 4th, 2011
I have released Emdros version 3.3.0 over at SourceForge.Net.
Please note that the implementation and method of indexing of the Full Text Search are subject to change, as this feature is still experimental.
February 14th, 2011
I have just finished adding a new feature to the topographic part of the MQL query language.
Hitherto, the only relation one could specify for containment between an inner object block and the outer container was “part_of”, and it was always relative to the containing substrate.
In plain English, that meant that the inner object’s monad set had to be a subset of the outer object’s monad set, or, if the inner block was at the outermost level, a subset of the monad set given in the IN clause after SELECT ALL OBJECTS.
Now, you can specify these four relations:
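- part_of(substrate) // The default
- part_of(universe) // To disregard gaps in the substrate
- overlap(substrate) // To require only a shared monad with the substrate
- overlap(universe) // To require only a shared monad, disregarding gaps in the substrate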
The overlap relation means that the inner object must have a non-empty intersection with the outer substrate or universe, i.e., it must share at least one monad with it.
This makes it possible to specify things like this:
SELECT ALL OBJECTS
IN Aramaic_monads // Pre-defined monad set
WHERE
// This means that we want all clauses which share at least one monad
// with the Aramaic_monads monad set
[Clause overlap(substrate)
   // This finds all phrases inside the left and right boundaries of
   // the outer clause, regardless of any gaps in the clause.
   [Phrase part_of(universe)]
]
GO
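The difference between part_of and overlap, and between substrate and universe, can be modelled with plain Python sets of monads. This is only a sketch of the semantics as described above, not Emdros code: the `span` and `satisfies` helpers are hypothetical names, and the universe is assumed to be the unbroken stretch from the first to the last monad of the outer monad set.

```python
def span(monads):
    """All monads from the first to the last one, with any gaps filled in."""
    return set(range(min(monads), max(monads) + 1))


def satisfies(inner, outer, relation="part_of", scope="substrate"):
    """Check an inner object's monad set against an outer monad set.

    scope="substrate" uses the outer monad set as-is, so gaps count.
    scope="universe" uses the whole stretch between its first and last monad.
    """
    container = outer if scope == "substrate" else span(outer)
    if relation == "part_of":
        # The inner monad set must be a subset of the container.
        return inner <= container
    if relation == "overlap":
        # The inner monad set must share at least one monad with the container.
        return bool(inner & container)
    raise ValueError("unknown relation: %r" % (relation,))


# A clause with a gap at monads 4-6, and a phrase sitting inside that gap.
clause = {1, 2, 3, 7, 8}
phrase_in_gap = {4, 5}
```

With these values, `satisfies(phrase_in_gap, clause, "part_of", "substrate")` is false because the phrase lies in the gap, while the same check with `scope="universe"` is true because the gap is disregarded.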
This will appear in the next public release after 3.2.0.
If anyone is interested in trying this out, please let me know.
October 30th, 2010
I’ve finished the implementation, tuning, and testing of Full Text Search (FTS) for Emdros.
The implementation is part of the libharvest library, and is written in C++ like the rest of Emdros.
I implemented the basic idea in Python first, then reimplemented it in C++. Python is so malleable that it is ideal for this sort of prototyping work.
The Full Text Search has a lot of features, including:
- Index “documents”, which must exist as object types.
- Index documents based on “indexed object types” (e.g., token) and one indexed feature of the indexed object type.
- Search within “documents”.
- Chainable filters that modify token strings before being indexed, e.g., to weed out stop-words, or to strip, lower-case, or otherwise alter the token strings.
- Tokenization of the query-string by splitting on spaces.
- Optional application of the chainable filters to the query-terms after tokenization, so as to be more likely to match the indexed feature.
- Google-like “quoted strings” that require the query-terms to be adjacent.
- More than one “quoted string” allowed in the query-string.
- Return results as list of three-tuples (document-first-monad, document-last-monad, first-search-term-first-monad)
- Return results as customizable snippets of real tokens, with optional highlighting of query terms.
- Command-line tools for both indexing and searching.
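Several of the features listed above (chainable filters, query tokenization, quoted strings, and three-tuple results) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the libharvest implementation: every function name here is hypothetical, tokens are assumed to be (monad, string) pairs, documents are assumed to be (first-monad, last-monad) pairs, and "adjacent" is taken to mean consecutive monads. Snippets and highlighting are left out.

```python
import re
from collections import defaultdict


def apply_filters(token, filters):
    """Run a token through the filter chain; a filter returns None to drop it."""
    for f in filters:
        token = f(token)
        if token is None:
            return None
    return token


def lowercase(token):
    return token.lower()


def strip_punctuation(token):
    # Returns None when nothing is left, which drops the token.
    return token.strip(".,;:!?\"'") or None


def make_stop_word_filter(stop_words):
    return lambda token: None if token in stop_words else token


def build_index(tokens, filters):
    """Map each filtered token string to the list of monads where it occurs."""
    index = defaultdict(list)
    for monad, text in tokens:
        term = apply_filters(text, filters)
        if term is not None:
            index[term].append(monad)
    return index


def parse_query(query, filters):
    """Split on spaces, keeping each "quoted string" together as one phrase."""
    phrases = []
    for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', query):
        words = quoted.split() if quoted else [bare]
        terms = [apply_filters(w, filters) for w in words]
        terms = [t for t in terms if t is not None]
        if terms:
            phrases.append(terms)
    return phrases


def phrase_hits(index, phrase):
    """First monads at which the phrase's terms occur at consecutive monads."""
    return [m for m in index.get(phrase[0], [])
            if all(m + i in index.get(t, ()) for i, t in enumerate(phrase))]


def search(index, documents, query, filters):
    """Return (doc_first_monad, doc_last_monad, first_hit_monad) three-tuples.

    A document matches when every phrase of the query occurs inside it.
    """
    phrases = parse_query(query, filters)
    results = []
    for first, last in documents:
        hits = [[m for m in phrase_hits(index, p) if first <= m <= last]
                for p in phrases]
        if hits and all(hits):
            results.append((first, last, hits[0][0]))
    return results


# Two tiny "documents" of tokens, numbered by monad.
tokens = list(enumerate("In the beginning God created And God said".split(), 1))
documents = [(1, 5), (6, 8)]
filters = [lowercase, strip_punctuation]
index = build_index(tokens, filters)
```

Note that if a stop-word filter is added to the chain, dropped tokens leave holes in the monad sequence, so a real implementation would have to decide how adjacency is measured across them; this sketch does not address that.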
This will appear in the next public release of Emdros.
Interested parties should contact me via email to get the latest sources.