From ed7137ac5cd9791c4ac31716661b3b33399aa468 Mon Sep 17 00:00:00 2001 From: "Silas S. Brown" <ssb22@cam.ac.uk> Date: Fri, 28 Jun 2019 08:16:02 +0100 Subject: [PATCH] Update README.md, Web Adjuster --- README.md | 63 ++++++++++++++++++++++++++++------------------------- adjuster.py | 6 ++--- 2 files changed, 36 insertions(+), 33 deletions(-) diff --git a/README.md b/README.md index 867acc8..b57868e 100644 --- a/README.md +++ b/README.md @@ -34,10 +34,6 @@ Then run adjuster.py with the appropriate options (see below), or use it in a WS Web Adjuster is free software licensed under the Apache License, Version 2.0 (this is also the license used by Tornado itself). If you use it in a good project, I’d appreciate hearing about it. -If you need to cite a peer-reviewed paper: - -Silas S. Brown. Web Annotation with Modiï¬ed-Yarowsky and Other Algorithms. Overload 112 (December 2012) pp.4-7. - Annotator Generator =================== @@ -51,11 +47,11 @@ Annotator Generator is an examples-driven generator of fast text annotators. “ * It can output the annotations alone or it can combine them with the original text using HTML Ruby markup or simple braces -* If anything is unclear (didn’t happen in the examples, or there’s not enough context to ï¬gure out which example should be applied) then the program will leave it unannotated so you can pass it to a backup annotation program if you have one. +* If anything is unclear (didn’t happen in the examples, or there’s not enough context to figure out which example should be applied) then the program will leave it unannotated so you can pass it to a backup annotation program if you have one. -* If you have no backup annotator then try setting the `-y` option, which makes Annotator Generator try harder to ï¬nd context-independent rules with context-dependent exceptions, so as to annotate as much text as possible. +* If you have no backup annotator then try setting the `-y` option, which makes Annotator Generator try harder to find context-independent rules with context-dependent exceptions, so as to annotate as much text as possible. -* Generated annotators can act as ï¬lters for Web Adjuster; options are also provided for generating client-side annotators for Android and iOS, and a clipboard annotator for Windows and Windows Mobile, or you could format the annotations on a Unix terminal +* Generated annotators can act as filters for Web Adjuster; options are also provided for generating client-side annotators for Android and iOS, and a clipboard annotator for Windows and Windows Mobile, or you could format the annotations on a Unix terminal Legal considerations -------------------- @@ -74,6 +70,13 @@ Legally obtaining that original annotated corpus is up to you. _If you are in th Laws outside the UK are different (and I’m not a lawyer) so check carefully. But if the website’s terms don’t actually prohibit writing an unpublished scraper for non-commercial mining purposes, perhaps you won’t need a legal exception—but you should still respect their bandwidth and do it slowly, both for moral reasons (it’s the right thing to do) and pragmatic ones (you won’t want their sysadmins and service providers taking action against you). +Citation +-------- + +If you need to cite a peer-reviewed paper: + +Silas S. Brown. Web Annotation with Modified-Yarowsky and Other Algorithms. Overload 112 (December 2012) pp.4-7. + TermLayout ========== @@ -85,11 +88,11 @@ TermLayout is a text-mode HTML formatter for Unix terminals which supports: * Wide characters (uses locale settings from LC_CTYPE, LANG etc) -* Smaller terminal sizes. In some cases a table will still end up being wider than the terminal and not easily reflowable; if that happens then at least each cell should ï¬t. But in many cases TermLayout can arrange for no horizontal scrolling to be necessary. +* Smaller terminal sizes. In some cases a table will still end up being wider than the terminal and not easily reflowable; if that happens then at least each cell should fit. But in many cases TermLayout can arrange for no horizontal scrolling to be necessary. Unrecognised markup is left in the output for inspection. -TermLayout is _not_ a Web browser: it has no facilities for navigating links. It is meant only for formatting text on a terminal using HTML markup. I wrote it when I wanted to page through a document with Ruby markup in fbterm but couldn’t ï¬nd a text-mode browser that would format this markup correctly. +TermLayout is _not_ a Web browser: it has no facilities for navigating links. It is meant only for formatting text on a terminal using HTML markup. I wrote it when I wanted to page through a document with Ruby markup in fbterm but couldn’t find a text-mode browser that would format this markup correctly. If you are using TermLayout with an annotator generated by Annotator Generator, you might also be interested in `tmux-annotator.sh` which sets up tmux with a “hotkey†to annotate the current screen and display the result in TermLayout. @@ -100,7 +103,7 @@ General options --------------- `--config` -: Name of the configuration file to read, if any. The process's working directory will be set to that of the configuration file so that relative pathnames can be used inside it. Any option that would otherwise have to be set on the command line may be placed in this file as an `option="value"` or `option='value'` line (without any double-hyphen prefix). Multi-line values are possible if you quote them in `"""`...`"""`, and you can use standard \ escapes. You can also set config= in the configuration file itself to import another configuration file (for example if you have per-machine settings and global settings). If you want there to be a default configuration file without having to set it on the command line every time, an alternative option is to set the ADJUSTER_CFG environment variable. +: Name of the configuration file to read, if any. The process's working directory will be set to that of the configuration file so that relative pathnames can be used inside it. Any option that would otherwise have to be set on the command line may be placed in this file as an option="value" or option='value' line (without any double-hyphen prefix). Multi-line values are possible if you quote them in """...""", and you can use standard \ escapes. You can also set config= in the configuration file itself to import another configuration file (for example if you have per-machine settings and global settings). If you want there to be a default configuration file without having to set it on the command line every time, an alternative option is to set the ADJUSTER_CFG environment variable. `--version` : Just print program version and exit @@ -109,7 +112,7 @@ Network listening and security settings --------------------------------------- `--port` (default 28080) -: The port to listen on. Setting this to 80 will make it the main Web server on the machine (which will likely require root access on Unix); setting it to 0 disables request-processing entirely (for if you want to use only the Dynamic DNS and watchdog options); setting it to -1 selects a local port in the ephemeral port range, in which case address and port will be written in plain form to standard output if it's not a terminal and `--`background is set (see also `--`just-me). +: The port to listen on. Setting this to 80 will make it the main Web server on the machine (which will likely require root access on Unix); setting it to 0 disables request-processing entirely (for if you want to use only the Dynamic DNS and watchdog options); setting it to -1 selects a local port in the ephemeral port range, in which case address and port will be written in plain form to standard output if it's not a terminal and --background is set (see also --just-me). `--publicPort` (default 0) : The port to advertise in URLs etc, if different from 'port' (the default of 0 means no difference). Used for example if a firewall prevents direct access to our port but some other server has been configured to forward incoming connections. @@ -118,7 +121,7 @@ Network listening and security settings : The user name to run as, instead of root. This is for Unix machines where port is less than 1024 (e.g. port=80)—you can run as root to open the privileged port, and then drop privileges. Not needed if you are running as an ordinary user. `--address` -: The address to listen on. If unset, will listen on all IP addresses of the machine. You could for example set this to localhost if you want only connections from the local machine to be received, which might be useful in conjunction with `--`real_proxy. +: The address to listen on. If unset, will listen on all IP addresses of the machine. You could for example set this to localhost if you want only connections from the local machine to be received, which might be useful in conjunction with --real_proxy. `--password` : The password. If this is set, nobody can connect without specifying ?p= followed by this password. It will then be sent to them as a cookie so they don't have to enter it every time. Notes: (1) If wildcard_dns is False and you have multiple domains in host_suffix, then the password cookie will have to be set on a per-domain basis. (2) On a shared server you probably don't want to specify this on the command line where it can be seen by process-viewing tools; use a configuration file instead. @@ -133,7 +136,7 @@ Network listening and security settings : Whether or not to allow running with no password. Off by default as a safeguard against accidentally starting an open proxy. `--prohibit` (default wiki.*action=edit) -: Comma-separated list of regular expressions specifying URLs that are not allowed to be fetched unless `--`real_proxy is in effect. Browsers requesting a URL that contains any of these will be redirected to the original site. Use for example if you want people to go direct when posting their own content to a particular site (this is of only limited use if your server also offers access to any other site on the Web, but it might be useful when that's not the case). Include ^https in the list to prevent Web Adjuster from fetching HTTPS pages for adjustment and return over normal HTTP. This access is enabled by default now that many sites use HTTPS for public pages that don't really need to be secure, just to get better placement on some search engines, but if sending confidential information to the site then beware you are trusting the Web Adjuster machine and your connection to it, plus its certificate verification might not be as thorough as your browser's. +: Comma-separated list of regular expressions specifying URLs that are not allowed to be fetched unless --real_proxy is in effect. Browsers requesting a URL that contains any of these will be redirected to the original site. Use for example if you want people to go direct when posting their own content to a particular site (this is of only limited use if your server also offers access to any other site on the Web, but it might be useful when that's not the case). Include ^https in the list to prevent Web Adjuster from fetching HTTPS pages for adjustment and return over normal HTTP. This access is enabled by default now that many sites use HTTPS for public pages that don't really need to be secure, just to get better placement on some search engines, but if sending confidential information to the site then beware you are trusting the Web Adjuster machine and your connection to it, plus its certificate verification might not be as thorough as your browser's. `--prohibitUA` (default TwitterBot) : Comma-separated list of regular expressions which, if they occur in browser strings, result in the browser being redirected to the original site. Use for example if you want certain robots that ignore robots.txt to go direct. @@ -151,16 +154,16 @@ Network listening and security settings : Whether or not to pass on requests for /robots.txt. If this is False then all robots will be asked not to crawl the site; if True then the original site's robots settings will be mirrored. The default of False is recommended. `--just_me` (default False) -: Listen on localhost only, and check incoming connections with an ident server (which must be running on port 113) to ensure they are coming from the same user. This is for experimental setups on shared Unix machines; might be useful in conjuction with `--`real_proxy. +: Listen on localhost only, and check incoming connections with an ident server (which must be running on port 113) to ensure they are coming from the same user. This is for experimental setups on shared Unix machines; might be useful in conjuction with --real_proxy. `--one_request_only` (default False) -: Shut down after handling one request. This is for use in inefficient CGI-like environments where you cannot leave a server running permanently, but still want to start one for something that's unsupported in WSGI mode (e.g. js_reproxy): run with `--`one_request_only and forward the request to its port. You may also wish to set `--`seconds if using this. +: Shut down after handling one request. This is for use in inefficient CGI-like environments where you cannot leave a server running permanently, but still want to start one for something that's unsupported in WSGI mode (e.g. js_reproxy): run with --one_request_only and forward the request to its port. You may also wish to set --seconds if using this. `--seconds` (default 0) : The maximum number of seconds for which to run the server (0 for unlimited). If a time limit is set, the server will shut itself down after the specified length of time. `--stdio` (default False) -: Forward standard input and output to our open port, in addition to being open to normal TCP connections. This might be useful in conjuction with `--`one-request-only and `--`port=-1. +: Forward standard input and output to our open port, in addition to being open to normal TCP connections. This might be useful in conjuction with --one-request-only and --port=-1. `--upstream_proxy` : address:port of a proxy to send our requests through. This can be used to adapt existing proxy-only mediators to domain rewriting, or for a caching proxy. Not used for ip_query_url options, own_server or fasterServer. If address is left blank (just :port) then localhost is assumed and https URLs will be rewritten into http with altered domains; you'll then need to set the upstream proxy to send its requests back through the adjuster (which will listen on localhost:port+1 for this purpose) to undo that rewrite. This can be used to make an existing HTTP-only proxy process HTTPS pages. @@ -208,7 +211,7 @@ General adjustment options : Code to append to the HEAD section of every HTML document that has a BODY. Use for example to add your own stylesheet links and scripts. Not added to documents that lack a BODY such as framesets. `--headAppendCSS` -: URL of a stylesheet to add to the HEAD section of every HTML document that has a BODY. This option automatically generates the LINK REL=... markup for it, and also tries to delete the string '!important' from other stylesheets, to emulate setting this stylesheet as a user CSS. Additionally, it is not affected by `--`js-upstream as headAppend is. You can also include one or more 'fields' in the URL, by marking them with %s and following the URL with options e.g. http://example.org/style%s-%s.css;1,2,3;A,B will allow combinations like style1-A.css or style3-B.css; in this case appropriate selectors are provided with the URL box (values may optionally be followed by = and a description), and any visitors who have not set their options will be redirected to the URL box to do so. +: URL of a stylesheet to add to the HEAD section of every HTML document that has a BODY. This option automatically generates the LINK REL=... markup for it, and also tries to delete the string '!important' from other stylesheets, to emulate setting this stylesheet as a user CSS. Additionally, it is not affected by --js-upstream as headAppend is. You can also include one or more 'fields' in the URL, by marking them with %s and following the URL with options e.g. http://example.org/style%s-%s.css;1,2,3;A,B will allow combinations like style1-A.css or style3-B.css; in this case appropriate selectors are provided with the URL box (values may optionally be followed by = and a description), and any visitors who have not set their options will be redirected to the URL box to do so. `--protectedCSS` : A regular expression matching URLs of stylesheets with are "protected" from having their '!important' strings deleted by headAppendCSS's logic. This can be used for example if you are adding scripts to allow the user to choose alternate CSS files in place of headAppendCSS, and you wish the alternate CSS files to have the same status as the one supplied in headAppendCSS. @@ -259,7 +262,7 @@ General adjustment options : A list of (old) browsers that cannot be relied on to process Unicode zero-width space (U+200b) correctly and need it removed from websites `--codeChanges` -: Several lines of text specifying changes that are to be made to all HTML and Javascript code files on certain sites; use as a last resort for fixing a site's scripts. This option is best set in the configuration file and surrounded by r`"""`...`"""`. The first line is a URL prefix (just "http" matches all); append a # to match an exact URL instead of a prefix, and #+number (e.g. #1 or #2) to match an exact URL and perform the change only that number of times in the page. The second line is a string of code to search for, and the third is a string to replace it with. Further groups of URL/search/replace lines may follow; blank lines and lines starting with # are ignored. If the 'URL prefix' starts with a * then it is instead a string to search for within the code of the document body; any documents containing this code will match; thus it's possible to write rules of the form 'if the code contains A, then replace B with C'. This processing takes place before any 'delete' option takes effect so it's possible to pick up on things that will be deleted, and it occurs after the domain rewriting so it's possible to change rewritten domains in the search/replace strings (but the URL prefix above should use the non-adjusted version). +: Several lines of text specifying changes that are to be made to all HTML and Javascript code files on certain sites; use as a last resort for fixing a site's scripts. This option is best set in the configuration file and surrounded by r"""...""". The first line is a URL prefix (just "http" matches all); append a # to match an exact URL instead of a prefix, and #+number (e.g. #1 or #2) to match an exact URL and perform the change only that number of times in the page. The second line is a string of code to search for, and the third is a string to replace it with. Further groups of URL/search/replace lines may follow; blank lines and lines starting with # are ignored. If the 'URL prefix' starts with a * then it is instead a string to search for within the code of the document body; any documents containing this code will match; thus it's possible to write rules of the form 'if the code contains A, then replace B with C'. This processing takes place before any 'delete' option takes effect so it's possible to pick up on things that will be deleted, and it occurs after the domain rewriting so it's possible to change rewritten domains in the search/replace strings (but the URL prefix above should use the non-adjusted version). `--boxPrompt` (default Website to adjust) : What to say before the URL box (when shown); may include HTML; for example if you've configured Web Adjuster to perform a single specialist change that can be described more precisely with some word other than 'adjust', you might want to set this. @@ -328,10 +331,10 @@ Javascript execution options ---------------------------- `--js_interpreter` -: Execute Javascript on the server for users who choose "HTML-only mode". You can set js_interpreter to PhantomJS, HeadlessChrome or HeadlessFirefox, and must have the appropriate one installed along with an appropriate version of Selenium (and ChromeDriver if you're using HeadlessChrome). If you have multiple users, beware logins etc may be shared! If a URL box cannot be displayed (no wildcard_dns and default_site is full, or processing a "real" proxy request) then htmlonly_mode auto-activates when js_interpreter is set, thus providing a way to partially Javascript-enable browsers like Lynx. If `--`viewsource is enabled then js_interpreter URLs may also be followed by .screenshot +: Execute Javascript on the server for users who choose "HTML-only mode". You can set js_interpreter to PhantomJS, HeadlessChrome or HeadlessFirefox, and must have the appropriate one installed along with an appropriate version of Selenium (and ChromeDriver if you're using HeadlessChrome). If you have multiple users, beware logins etc may be shared! If a URL box cannot be displayed (no wildcard_dns and default_site is full, or processing a "real" proxy request) then htmlonly_mode auto-activates when js_interpreter is set, thus providing a way to partially Javascript-enable browsers like Lynx. If --viewsource is enabled then js_interpreter URLs may also be followed by .screenshot `--js_upstream` (default False) -: Handle `--`headAppend, `--`bodyPrepend, `--`bodyAppend and `--`codeChanges upstream of our Javascript interpreter instead of making these changes as code is sent to the client, and make `--`staticDocs available to our interpreter as well as to the client. This is for running experimental 'bookmarklets' etc with browsers like Lynx. +: Handle --headAppend, --bodyPrepend, --bodyAppend and --codeChanges upstream of our Javascript interpreter instead of making these changes as code is sent to the client, and make --staticDocs available to our interpreter as well as to the client. This is for running experimental 'bookmarklets' etc with browsers like Lynx. `--js_frames` (default False) : When using js_interpreter, append the content of all frames and iframes to the main document. This might help with bandwidth reduction and with sites that have complex cross-frame dependencies that can be broken by sending separate requests through the adjuster. @@ -340,13 +343,13 @@ Javascript execution options : The number of virtual browsers to load when js_interpreter is in use. Increasing it will take more RAM but may aid responsiveness if you're loading multiple sites at once. `--js_429` (default True) -: Return HTTP error 429 (too many requests) if js_interpreter queue is too long at page-prefetch time. When used with `--`multicore, additionally close to new requests any core that's currently processing its full share of js_instances. +: Return HTTP error 429 (too many requests) if js_interpreter queue is too long at page-prefetch time. When used with --multicore, additionally close to new requests any core that's currently processing its full share of js_instances. `--js_restartAfter` (default 10) -: When js_interpreter is in use, restart each virtual browser after it has been used this many times (0=unlimited); might help work around excessive RAM usage in PhantomJS v2.1.1. If you have many `--`js-instances (and hardware to match) you could also try `--`js-restartAfter=1 (restart after every request) to work around runaway or unresponsive PhantomJS processes. If you have Headless Chrome you can probably set this to 0. +: When js_interpreter is in use, restart each virtual browser after it has been used this many times (0=unlimited); might help work around excessive RAM usage in PhantomJS v2.1.1. If you have many --js-instances (and hardware to match) you could also try --js-restartAfter=1 (restart after every request) to work around runaway or unresponsive PhantomJS processes. If you have Headless Chrome you can probably set this to 0. `--js_restartMins` (default 10) -: Restart an idle js_interpreter instance after about this number of minutes (0=unlimited); use this to stop the last-loaded page from consuming CPU etc indefinitely if no more requests arrive at that instance. Not applicable when `--`js-restartAfter=1. +: Restart an idle js_interpreter instance after about this number of minutes (0=unlimited); use this to stop the last-loaded page from consuming CPU etc indefinitely if no more requests arrive at that instance. Not applicable when --js-restartAfter=1. `--js_timeout1` (default 30) : When js_interpreter is in use, tell it to allow this number of seconds for initial page load. More time is allowed for XMLHttpRequest etc to finish (unless our client cuts the connection in the meantime). @@ -364,7 +367,7 @@ Javascript execution options : When js_interpreter is in use, have it send its upstream requests back through the adjuster on a different port. This allows js_interpreter to be used for POST forms, fixes its Referer headers when not using real_proxy, monitors AJAX for early completion, prevents problems with file downloads, and prefetches main pages to avoid holding up a js_interpreter instance if the remote server is down. `--js_UA` -: Custom user-agent string for js_interpreter requests, if for some reason you don't want to use the JS browser's default. If you prefix this with a * then the * is ignored and the user-agent string is set by the upstream proxy (`--`js_reproxy) so scripts running in the JS browser itself will see its original user-agent. +: Custom user-agent string for js_interpreter requests, if for some reason you don't want to use the JS browser's default. If you prefix this with a * then the * is ignored and the user-agent string is set by the upstream proxy (--js_reproxy) so scripts running in the JS browser itself will see its original user-agent. `--js_images` (default True) : When js_interpreter is in use, instruct it to fetch images just for the benefit of Javascript execution. Setting this to False saves bandwidth but misses out image onload events. @@ -379,7 +382,7 @@ Javascript execution options : When js_interpreter is in use, handle the webdriver instances in completely separate processes (not just separate threads) when the multiprocessing module is available. Recommended: if a webdriver instance gets 'stuck' in a way that somehow hangs its controlling process, we can detect and restart it. `--ssl_fork` (default False) -: (Unix only) Run SSL-helper proxies as separate processes to stop the main event loop from being stalled by buggy SSL libraries. This costs RAM, but adding `--`multicore too will limit the number of helpers to one per core instead of one per port, so `--`ssl-fork `--`multicore is recommended if you want more js_interpreter instances than cores. +: (Unix only) Run SSL-helper proxies as separate processes to stop the main event loop from being stalled by buggy SSL libraries. This costs RAM, but adding --multicore too will limit the number of helpers to one per core instead of one per port, so --ssl-fork --multicore is recommended if you want more js_interpreter instances than cores. Server control options ---------------------- @@ -409,7 +412,7 @@ Server control options : The watchdog device to use (set this to /dev/null to check main-thread responsiveness without actually pinging the watchdog) `--browser` -: The Web browser command to run. If this is set, Web Adjuster will run the specified command (which is assumed to be a web browser), and will exit when this browser exits. This is useful in conjunction with `--`real_proxy to have a personal proxy run with the browser. You still need to set the browser to use the proxy; this can sometimes be done via browser command line or environment variables. +: The Web browser command to run. If this is set, Web Adjuster will run the specified command (which is assumed to be a web browser), and will exit when this browser exits. This is useful in conjunction with --real_proxy to have a personal proxy run with the browser. You still need to set the browser to use the proxy; this can sometimes be done via browser command line or environment variables. `--run` : A command to run that is not a browser. If set, Web Adjuster will run the specified command and will restart it if it stops. The command will be stopped when Web Adjuster is shut down. This could be useful, for example, to run an upstream proxy. @@ -430,7 +433,7 @@ Media conversion options : If True, instead of recoding MP3 files unconditionally, try to add links to "lo-fi" versions immediately after each original link so you have a choice. `--pdftotext` (default False) -: If True, add links to run PDF files through the 'pdftotext' program (which must be present if this is set). A text link will be added just after any PDF link that is found, so that you have a choice of downloading PDF or text; note that pdftotext does not always manage to extract all text (you can use `--`pdfomit to specify URL patterns that should not get text links). The htmlJson setting will also be applied to the PDF link finder, and see also the guessCMS option. +: If True, add links to run PDF files through the 'pdftotext' program (which must be present if this is set). A text link will be added just after any PDF link that is found, so that you have a choice of downloading PDF or text; note that pdftotext does not always manage to extract all text (you can use --pdfomit to specify URL patterns that should not get text links). The htmlJson setting will also be applied to the PDF link finder, and see also the guessCMS option. `--pdfomit` : A comma-separated list of regular expressions which, if any are found in a PDF link's URL, will result in a text link not being generated for that PDF link (although a conversion can still be attempted if a user manually enters the modified URL). Use this to avoid confusion for PDF files you know cannot be converted. @@ -565,7 +568,7 @@ Speedup options : (Linux only) On multi-core CPUs, fork enough processes for all cores to participate in handling incoming requests. This increases RAM usage, but can help with high-load situations. Disabled on BSD/Mac due to unreliability (other cores can still be used for htmlFilter etc) `--internalPort` (default 0) -: The first port number to use for internal purposes when ssl_fork is in effect. Internal ports needed by real_proxy (for SSL) and js_reproxy are normally allocated from the ephemeral port range, but if ssl_fork delegates to independent processes then some of them need to be at known numbers. The default of 0 means one higher than 'port'; several unused ports may be needed starting at this number. If your Tornado is modern enough to support reuse_port then you can have multiple Adjuster instances listening on the same port (e.g. for one_request_only) provided they have different internalPort settings when run with ssl_fork. Note however that the `--`stop and `--`restart options will **not** distinguish between different internalPort settings, only 'port'. +: The first port number to use for internal purposes when ssl_fork is in effect. Internal ports needed by real_proxy (for SSL) and js_reproxy are normally allocated from the ephemeral port range, but if ssl_fork delegates to independent processes then some of them need to be at known numbers. The default of 0 means one higher than 'port'; several unused ports may be needed starting at this number. If your Tornado is modern enough to support reuse_port then you can have multiple Adjuster instances listening on the same port (e.g. for one_request_only) provided they have different internalPort settings when run with ssl_fork. Note however that the --stop and --restart options will **not** distinguish between different internalPort settings, only 'port'. `--fixed_ports` (default False) : Do not allocate ports (even internal ports) from the ephemeral port range even when this is otherwise possible. This option might help if you are firewalling your loopback interface and want to write specific exceptions (although that still won't work if you're using js_interpreter=HeadlessChrome or similar which opens its own ephemeral ports as well: use containers if you're concerned). Fixed ports may result in failures if internal ports are already taken. @@ -580,7 +583,7 @@ Logging options : Log timing statistics every N seconds (only when not idle) `--profile_lines` (default 5) -: Number of lines to log when profile option is in use (not applicable if using `--`multicore) +: Number of lines to log when profile option is in use (not applicable if using --multicore) `--renderLog` (default False) : Whether or not to log requests for character-set renderer images. Note that this can generate a **lot** of log entries on some pages. @@ -607,7 +610,7 @@ Logging options : What to say when an uncaught exception (due to a misconfiguration or programming error) has been logged. HTML markup is allowed in this message. If for some reason you have trouble accessing the log files, the traceback can usually be included in the page itself by placing {traceback} in the message. `--logDebug` (default False) -: Write debugging messages (to standard error if in the foreground, or to the logs if in the background). Use as an alternative to `--`logging=debug if you don't also want debug messages from other Tornado modules. On Unix you may also toggle this at runtime by sending SIGUSR1 to the process(es). +: Write debugging messages (to standard error if in the foreground, or to the logs if in the background). Use as an alternative to --logging=debug if you don't also want debug messages from other Tornado modules. On Unix you may also toggle this at runtime by sending SIGUSR1 to the process(es). Tornado-provided logging options are not listed above because they might vary across Tornado versions; run `python adjuster.py --help` to see a full list of the ones available on your setup. They typically include `log_file_max_size`, `log_file_num_backups`, `log_file_prefix` and `log_to_stderr`. diff --git a/adjuster.py b/adjuster.py index 2fdaa25..0d91086 100644 --- a/adjuster.py +++ b/adjuster.py @@ -108,9 +108,9 @@ elif '--html-options' in sys.argv or '--markdown-options' in sys.argv: return h.replace('&','&').replace('<','<').replace('>','>') else: return h help = amp(help) - for ttify in ["option=\"value\"","option='value'","\"\"\"","--"]: - if html: help=help.replace(ttify,"<nobr><kbd>"+ttify+"</kbd></nobr>") - else: help=help.replace(ttify,"`"+ttify+"`") + if html: + for ttify in ["option=\"value\"","option='value'","\"\"\"","--"]: + help=help.replace(ttify,"<nobr><kbd>"+ttify+"</kbd></nobr>") for w in ["lot","not","all","Important","between","any"]: if html: help=re.sub("(?<![A-Za-z])"+w.upper()+"(?![A-Za-z])","<strong>"+w+"</strong>",help) else: help=re.sub("(?<![A-Za-z])"+w.upper()+"(?![A-Za-z])","**"+w+"**",help) -- GitLab