Expose public API
Created by: Zach DeCook
lexconvert is a great command-line script; it should also expose a public API so that other Python programs can use it directly.
Possible usage (assuming I have no syntax errors):
import lexconvert
espeakPhones = lexconvert.phones2phones('unicode-ipa', u"ˌaɪpˌiˈeɪ", to='espeak')
assert espeakPhones == lexconvert.phones2phones('unicode-ipa',u"ˌaɪpˌiˈeɪ")['espeak']
# I imagine there's also a Pythonic way to read from and write to a file stream
# (or yield output as it's needed)
# That may also be a good thing to support.
"Public" functions should be clearly documented and must be expected to stay stable through minor version bumps.
This would also help with #3 (closed), since it's simpler to unit-test functions than to mock command-line arguments and capture stdout.
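To illustrate the testing point, here is a rough sketch of what a function-level test could look like. `toy_convert` and its mapping table are made-up stand-ins for illustration only; they are not lexconvert's real code or data.

```python
import unittest


def toy_convert(phonemes, table):
    """Stand-in for a public conversion function (NOT lexconvert's real code).

    Replaces each character found in `table` and leaves the rest unchanged.
    """
    return "".join(table.get(ch, ch) for ch in phonemes)


class ToyConvertTest(unittest.TestCase):
    def test_maps_known_symbols(self):
        # Made-up mapping, purely for illustration.
        table = {"a": "A", "b": "B"}
        self.assertEqual(toy_convert("abc", table), "ABc")


if __name__ == "__main__":
    unittest.main()
```

The test calls the function directly and compares return values, so there is no need to patch sys.argv or capture stdout.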
Imported comments:
By Silas S. Brown on 2019-10-03T10:36:26.627Z
This also depends on making the version numbers make more sense than they do now (perhaps SemVer?). Originally I started at something like 0.1 and generally incremented whichever decimal place felt appropriate to what I'd just done, so now we're at 0.27. But if we're going to start a stable API we need to define what counts as "major" versus "minor" version bumps. Perhaps we should start with a Python 3 version called 1.0.0 and go from there?
You can already import lexconvert and do lexconvert.convert(phonemes,from,to) (I've done this in throwaway scripts), so we might want to make that official.
Related though is this whole "byte strings versus Unicode strings" issue that's going to make the migration to Python 3 a headache. I think the current status is that some functions could take either without trouble, while others assume one or the other.
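One possible way to contain the problem at the API boundary is to normalise input to Unicode text once on entry, so the rest of the code only ever sees one string type. This is a minimal sketch, not existing lexconvert code: `to_text` is a hypothetical helper, and UTF-8 as the default encoding is an assumption.

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as a Unicode string, decoding it if it is a byte string.

    Hypothetical helper (not part of lexconvert). The UTF-8 default is an
    assumption; callers with other encodings would pass `encoding` explicitly.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value
```

A public function could call this on each string argument before doing any work, which would leave the byte-versus-Unicode decision in a single place.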
Another annoyance is going to be the way some of the code looks at os.environ for extra configuration options (like KANA_TYPE, DTALK_COMMAND_CODE, SPEAKJET_BINARY and BRAILLE_UNICODE, plus rather too many in the BBC Micro option). Since some of this environment-consulting happens at module-load time, it cannot (in its current form) be properly tested without reloading the module. It's basically a consequence of the way LexFormats works: I tried to give it enough general options for most formats, but there were some pesky little exceptions that just didn't seem to fit in very well and I ended up saying "oh, just check os.environ for it". There's probably a better way of doing that.
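One candidate for a "better way" is to look options up at call time and let callers override them explicitly, falling back to the environment only when nothing is passed in. The sketch below is an assumption about how that could look, not a description of lexconvert's code: `get_option` is a hypothetical helper and the option values used with it are placeholders.

```python
import os


def get_option(name, overrides=None, environ=None, default=None):
    """Look up a configuration option at call time rather than import time.

    Hypothetical helper (not part of lexconvert). Precedence: an explicit
    override wins, then the environment mapping, then `default`.
    """
    if overrides and name in overrides:
        return overrides[name]
    if environ is None:
        # Read os.environ lazily so tests can change it without a reload.
        environ = os.environ
    return environ.get(name, default)
```

A test could then pass its own mapping, for example get_option("KANA_TYPE", environ={"KANA_TYPE": "placeholder"}), without touching the real environment or reloading the module.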