Have made very good progress with the ExtJS-based UI and have attached
a screenshot showing the document browsing page, which runs completely
off the RESTful JSON API, i.e. there are no dynamic web pages on the
server side.
However, while starting to work with JSON and in-browser processing of
results, I've started to notice some bugs that occur when dealing with
documents containing non-ASCII characters, i.e. Unicode. Because UTF-8
characters are variable length, some of the document offsets skid when
the preceding characters are non-ASCII. I always knew this would be a
feature that needed to be implemented, but had forgotten that
JavaScript has native Unicode strings. For the in-browser hashing that
is required for the anonymous search feature of the Churnalism browser
extension to function correctly, the client-side hashes have to
exactly match the output of the server-side hashing. Thus, I need to
handle Unicode server-side at this point.
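
To illustrate the skid, here is a minimal sketch in Python (not the
SFM code itself; the helper name byte_to_char_offset and the sample
text are made up for the example). It assumes the skid comes from
counting offsets in UTF-8 bytes on one side and in characters on the
other, and shows how a byte offset can be mapped back to a character
offset:

```python
# Hypothetical helper, not taken from the SFM code base.

def byte_to_char_offset(utf8: bytes, byte_offset: int) -> int:
    """Convert a byte offset into UTF-8 data to a code point offset."""
    # Bytes of the form 10xxxxxx are continuation bytes; every other
    # byte starts a new code point, so counting those gives the
    # character index of the byte offset.
    return sum(1 for b in utf8[:byte_offset] if (b & 0xC0) != 0x80)


text = "café au lait"
data = text.encode("utf-8")

byte_pos = data.find(b"au")  # offset counted in bytes
char_pos = text.find("au")   # offset counted in code points

# The byte offset is one larger because "é" takes two bytes in UTF-8.
print(byte_pos, char_pos, byte_to_char_offset(data, byte_pos))
# prints: 6 5 5
```

One caveat: JavaScript strings are indexed in UTF-16 code units, so
for characters outside the Basic Multilingual Plane a code point
offset would still differ from the JavaScript string index.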
Currently just trying to make a sensible decision about which Unicode
library to use:
http://utfcpp.sourceforge.net/
or
http://icu-project.org/apiref/icu4c/classUnicodeString.html
This is definitely a worthwhile thing to do, as it does open up SFM to
global usage :)
Bit unsure whether to push the current version to GitHub before or
after implementing this feature.
Cheers,
Donny.
The other considerations with pushing to the master branch are that
the other tabs in the interface are not formatted using ExtJS and the
browser extension is not yet coded to deal with the JSON search
results. It would be great to get some feedback, so I am keen to push,
but other users might be confused by the unfinished presentation of
the other tabs.
If I do push, I can deploy to the demo instance to help you evaluate
the new interface.
In hindsight, I probably should have branched this piece of work, but
that might be a bit complex to fix now...
Cheers,
Donny.
Managed to fix the Unicode skid bug, so will push this evening. There
is still a slight issue where long Unicode sections get split because
of erroneous whitespace detection, but the vast majority of fragments
are correct.
Will email the details of the test instance off-list.
Cheers,
Donny.