On Fri, 5 Apr 2002, Dr. Sharukh K. R. Pavri wrote:
I think Manish has already answered your questions correctly; I'll just add to that.
I downloaded your lecture on regex from the ACM website. I have a query; I am a bit confused with the following lines from your examples.
[A-Z^] - match anything from A-Z or ^
[-A-Z] - match anything from A-Z or -
[+-*/] - match anything from + to * and /
[-+*/] - match - + * /
Square brackets mean: match any one character from those listed within the brackets. The ^ and - characters have special meaning inside square brackets.
If - is not the first or last character, then it defines a range.
That is, A-Z means from A to Z. You can have multiple ranges and mix ranges with individual characters, so you can write
[0-9a-fA-F] to match a hex digit, or even [abc0-9defA-EF]; both match the same characters.
When - is the first or last character, it matches a literal -.
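
Here is a small sketch of the range rules, using Python's re module purely as an assumption (the lecture did not target a particular engine, and the variable names are made up for illustration). It checks that the two hex-digit classes accept the same characters, and that a - at the start or end of a class is a literal hyphen:

```python
import re
import string

# Two spellings of the hex-digit class from the text.
hex_a = re.compile(r"[0-9a-fA-F]")
hex_b = re.compile(r"[abc0-9defA-EF]")

# Compare them on every printable ASCII character.
same = all(
    bool(hex_a.fullmatch(c)) == bool(hex_b.fullmatch(c))
    for c in string.printable
)
print(same)  # True

# A leading or trailing - is a literal hyphen, not a range operator.
leading = re.compile(r"[-A-Z]")
trailing = re.compile(r"[A-Z-]")
print(bool(leading.fullmatch("-")), bool(trailing.fullmatch("-")))  # True True
print(bool(leading.fullmatch("a")))  # False: lowercase is not in the class
```

fullmatch is used so that each test is about exactly one character, which is what a single character class consumes.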
^, when used as the first character in the class, inverts the match.
Therefore, [^A-Z] means any character not in A-Z. Note that this is not the same as matching no character: there must be a character to match, and it must be something other than A-Z.
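
To make the "must still match a character" point concrete, here is a minimal sketch, again assuming Python's re module (names are illustrative only). An inverted class fails on the empty string, and a ^ that is not first in the class is just a literal caret:

```python
import re

not_upper = re.compile(r"[^A-Z]")

print(bool(not_upper.fullmatch("a")))  # True: "a" is not in A-Z
print(bool(not_upper.fullmatch("Q")))  # False: "Q" is in A-Z
print(bool(not_upper.fullmatch("")))   # False: there is no character to consume

# When ^ is not the first character it loses its special meaning.
upper_or_caret = re.compile(r"[A-Z^]")
print(bool(upper_or_caret.fullmatch("^")))  # True
print(bool(upper_or_caret.fullmatch("a")))  # False
```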
The closing bracket also has special meaning, for obvious reasons. To include a closing square bracket, put it as the first character in your class:
[]A-Z]
Opening square brackets don't have this problem.
To include both a closing square bracket and -, put the ] first and the - last. For an inverted match on a square bracket, there shouldn't be an issue, but I haven't tried it.
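
The bracket cases can be tried out with a short sketch like the one below. It assumes Python's re module, which accepts ] as a literal when it is the first character of a class (also directly after ^). Other engines may well behave differently, so treat this as one data point rather than a general rule:

```python
import re

# ] placed first is a literal closing bracket.
close_or_upper = re.compile(r"[]A-Z]")
print(bool(close_or_upper.fullmatch("]")))  # True
print(bool(close_or_upper.fullmatch("Q")))  # True

# ] first and - last: both are literals alongside the A-Z range.
both = re.compile(r"[]A-Z-]")
print(bool(both.fullmatch("]")), bool(both.fullmatch("-")))  # True True

# Inverted class: ] directly after ^ is still taken literally here.
inverted = re.compile(r"[^]A-Z]")
print(bool(inverted.fullmatch("]")))  # False: ] is excluded
print(bool(inverted.fullmatch("a")))  # True
```

As an example of engines disagreeing, JavaScript treats [] as an empty class, so the same patterns would not mean the same thing there; escaping the bracket as \] is the more portable choice.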
Also, keep in mind that different regex engines interpret things differently.