New regex tester (runs in a Docker container) #6
Merged
Txt2regex needs to know regex-related information for each program it supports. For example: the list of metacharacters, how to escape a metacharacter to match it literally, and the availability of POSIX character classes.

Instead of relying on documentation to get that information, there's a new `tests/regex-tester.sh` script that calls the real programs with specially crafted regexes and sample texts, verifying how those programs behave in "real life".

To have a permanent record, the output of this script is also saved to this repository. This way we can detect changes in behavior when a program version is updated.

To avoid having to install specific software on the developer machine, a Docker image is used to isolate all the necessary software and this script is run inside that image (via `make test-regex`).

- Remove all the obsolete files from the old tester:
  - test-suite/javascript.html
  - test-suite/procmail-re-test.sh
  - test-suite/result.txt
  - test-suite/test-suite.sh
- New `tests/regex-tester.sh` script
- New `tests/regex-tester.txt` script output record
- New `tests/Dockerfile` image with all the txt2regex-supported programs installed and ready to be used
- New make target: `test-regex` to run the tester and save to the output file

In this commit, the new regex tester supports all the programs that the previous `test-suite.sh` script used to support. New programs will be added in following commits.
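To illustrate the idea of asking the real program instead of trusting its documentation, here is a minimal sketch that probes `grep -E` with a few crafted regexes and sample texts. The helper name `test_grep_ere` and the OK/FAIL output format are assumptions made for this example; they are not taken from `tests/regex-tester.sh`, which may work quite differently.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only (not the real tester code).
# Feeds the sample text to `grep -E` and reports whether the regex matched.
test_grep_ere() {
    regex=$1
    string=$2
    if printf '%s\n' "$string" | grep -E -q -- "$regex"; then
        echo "OK"
    else
        echo "FAIL"
    fi
}

# Is the POSIX character class [[:digit:]] available?
test_grep_ere '^[[:digit:]]+$' '123'

# Does \. match a literal dot, and nothing else?
test_grep_ere '^a\.b$' 'a.b'
test_grep_ere '^a\.b$' 'axb'
```

With a POSIX-compliant grep this should print OK, OK and FAIL. Saving such output to a file under version control is what makes a later change in behavior show up as a diff.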
This new option accepts a program name to be skipped. Useful to test "all but one". This will be handy in the next commit, when adding support for vi.
Thanks Mario Domenech Goulart for the magical command and guidance.
All-new topic about the new regex tester. Now the list of program versions is the output of a command, and that command is checked to be correct by clitest (which is run in the CI). In other words: that list will always be up-to-date now.
This could mask problems, since it could be expanded to some special char on both sides (regex and string), giving false positives. This is also a metacharacter for border in some tools.
- Always use raw strings for the "string" argument, if the program supports it.
- As a fallback, use the new `escape()` function to escape the '\' chars.

Note that the "regex" argument should not be escaped, since the goal of the "brute force" tests is exactly discovering how many '\' are necessary to properly match a pattern.
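As a rough sketch of the fallback described above: for a program without raw strings, every '\' in the sample string is doubled before it is embedded in a string literal. The function body below is an assumption written for this example; only the name `escape()` comes from the commit message, and the real implementation may differ.

```shell
#!/bin/sh
# Hypothetical sketch of an escape() helper: doubles every backslash.
escape() {
    printf '%s\n' "$1" | sed 's/\\/\\\\/g'
}

sample='a\tb'

# Prints a\\tb
escape "$sample"

# awk has no raw strings: "a\tb" would contain a tab, while the escaped
# form "a\\tb" yields the original text. Prints a\tb
awk "BEGIN { print \"$(escape "$sample")\" }"
```

Only the "string" argument goes through this step; the regex is passed untouched so the tests can still discover how many '\' a program really needs.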
It was a mistake not to enforce full matches from the start. Partial matches are a problem when test_type=match. This is just an intermediary step to make the .txt diff easier to read if there are behavior changes (none detected on visual inspection). The next commit will change all the actual regexes to be fully anchored (explicit is better than implicit) and the .txt file should not change in behavior (only the regexes will have the $ added).
Now add $ to the end of all the test regexes. See parent commit for details.
Now add ^ to the start of all the test regexes. See "part 1" commit for details.
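A small sketch of why partial matches are a problem, using `grep -E` as the program under test. The helper name `matches` is made up for this example and is not part of the tester.

```shell
#!/bin/sh
# Hypothetical helper: prints "yes" if grep -E finds the regex in the string.
matches() {
    if printf '%s\n' "$2" | grep -E -q -- "$1"; then
        echo yes
    else
        echo no
    fi
}

# Unanchored: a{2} is found inside "aaa", so the test passes
# even though the regex does not describe the whole string.
matches 'a{2}' 'aaa'

# Fully anchored: only an exact "aa" is accepted.
matches '^a{2}$' 'aaa'
matches '^a{2}$' 'aa'
```

This should print yes, no and yes. Writing the `^` and `$` explicitly in every test regex keeps the recorded results independent of whether a given program matches partially by default.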
Now that all the regexes are anchored, some tests became irrelevant. Also remove the tests for ^ and $ being at regex start/end, since they do not fit in the new "^...$" format for all regexes.
aureliojargas added a commit that referenced this pull request Sep 13, 2022
Since #6, the regex rules are based on the results of actually running the programs in a Docker container, using a special script. So now it's mandatory to first add the new program to that container and properly test it. The process became even more complex than before, but now it's reliable and tested.
Txt2regex needs to know regex-related information for each program it supports. For example: the list of metacharacters, how to escape a metacharacter to match it literally and the availability of POSIX character classes.

Instead of relying on documentation to get that information, there's a new `tests/regex-tester.sh` script that calls the real programs with specially crafted regexes and sample texts, verifying how those programs behave in "real life".

To have a trackable and public record, the output of this tester is also saved to this repository, in a readable and grepable plain text file. This way we can detect changes in behavior when a program version is updated.

To avoid having to install specific software on the developer machine, a Docker image is used to isolate all the necessary software and this script is run inside that image (via `make test-regex`).

- Remove all the obsolete files from the old tester:
  - test-suite/javascript.html
  - test-suite/procmail-re-test.sh
  - test-suite/result.txt
  - test-suite/test-suite.sh
- New `tests/regex-tester.sh` script
- New `tests/regex-tester.txt` script output record
- New `tests/Dockerfile` image with all the txt2regex-supported programs installed and ready to be used
- New make target: `test-regex` to run the tester and save to the output file
- New make target: `test-regex-shell` to enter the interactive shell inside the test container