Java regex pattern not working as intended

Uebertreiberman

I am new to patterns and regex and have encountered a problem which I can't solve. This is my code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexTest { // class name is arbitrary

    public static void main(String[] args) {

        Pattern pattern = Pattern.compile("(!?)(fw|ri|le|cl|rs)[\\s,]*(\\d*\\.*\\d*|\"\\w*\")?[\\s,]*(\\d*\\.*\\d*|\"\\w*\")?[\\s,]*(\\d*\\.*\\d*|\"\\w*\")?");
        Matcher matcher = pattern.matcher("!fw 90.0 \"hello\" 70.0");

        // group() may only be called after a successful find()
        if (matcher.find()) {
            for (int i = 0; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ") " + matcher.group(i));
            }
        }
    }
}

I used regexr.com to create the regex, and on the website it works as planned. It should find three arguments, each of which can be either a number or a String, where a String is enclosed in quotation marks. As I said, it works on regexr.com, but in Java it only works when there are no Strings. What am I doing wrong? (The regex without the extra backslashes is (!?)(fw|ri|le|cl|rs)[\s,]*(\d*\.*\d*|"\w*")?[\s,]*(\d*\.*\d*|"\w*")?[\s,]*(\d*\.*\d*|"\w*")? )

Thanks in advance.

Edit: Some examples of what does happen and what doesn't:

Working as intended:

Input: !fw 1.0 2.0 3.0

Output: Group 0) !fw 1.0 2.0 3.0 Group 1) ! Group 2) fw Group 3) 1.0 Group 4) 2.0 Group 5) 3.0

Not working as intended:

Input: !fw 1.0 \"hello\" 3.0

Output: Group 0) !fw 1.0 Group 1) ! Group 2) fw Group 3) 1.0 Group 4) Group 5)

Intended Output: Group 0) !fw 1.0 "hello" 3.0 Group 1) ! Group 2) fw Group 3) 1.0 Group 4) "hello" Group 5) 3.0

Cecilya

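You can get your regex to work if you switch the order of the alternatives for Strings and numbers. The number alternative \\d*\\.*\\d* can match the empty string, so it always succeeds first, and because everything after it is optional the engine never has to backtrack and try the String alternative: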

(!?)(fw|ri|le|cl|rs)[\\s,]*(\"\\w*\"|\\d*\\.*\\d*)?[\\s,]*(\"\\w*\"|\\d*\\.*\\d*)?[\\s,]*(\"\\w*\"|\\d*\\.*\\d*)?
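To see the effect of the alternative order, here is a minimal sketch that builds the pattern both ways and prints group 4 for the failing input from the question. The class and helper names (AlternationOrderDemo, build, groupOf) are made up for this illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationOrderDemo {
    // One optional argument, number alternative first (as in the question).
    static final String NUMBER_FIRST = "(\\d*\\.*\\d*|\"\\w*\")?";
    // The same argument with the String alternative first.
    static final String STRING_FIRST = "(\"\\w*\"|\\d*\\.*\\d*)?";

    // Builds the full command pattern around the given argument sub-pattern.
    static Pattern build(String arg) {
        return Pattern.compile(
            "(!?)(fw|ri|le|cl|rs)[\\s,]*" + arg + "[\\s,]*" + arg + "[\\s,]*" + arg);
    }

    // Returns the given group of the first match, or null if nothing matches.
    static String groupOf(Pattern p, String input, int group) {
        Matcher m = p.matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        String input = "!fw 1.0 \"hello\" 3.0";
        for (String arg : new String[] {NUMBER_FIRST, STRING_FIRST}) {
            System.out.println(arg + " -> group 4 = [" + groupOf(build(arg), input, 4) + "]");
        }
    }
}
```

With the number alternative first, group 4 comes back as an empty string, which is the empty "Group 4)" in the question's output; with the String alternative first it holds "hello" including the quotation marks.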

However, I'm not sure your regex does exactly what you want it to do. More specifically, it matches a lot more than you probably intend, e.g.:

!fw ...""

This is because so much in your regex is optional or can be repeated any number of times (like the dot, which I'm guessing is not what you intended). Assuming you want exactly 3 groups, each either a String or a number with an optional decimal part, separated by whitespace, a comma or nothing, you should use this regex:

(!?)(fw|ri|le|cl|rs)([\\s,]*(\"\\w*\"|\\d+(\\.\\d+)?)[\\s,]*){3}

This will match Strings such as:

!fw 90.0 \"hello\" 70.0

!fw \"hello\" 70.0

!fw\"hello\"70.0

but will not match

!fw ...\"\"
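The examples above can be checked with a short sketch like the following. It uses Matcher.find(), as in the question; the class name ThreeArgumentsDemo and the helper found are assumptions made for this illustration.

```java
import java.util.regex.Pattern;

class ThreeArgumentsDemo {
    // The {3} regex from this answer, written as a Java string literal.
    static final Pattern COMMAND = Pattern.compile(
        "(!?)(fw|ri|le|cl|rs)([\\s,]*(\"\\w*\"|\\d+(\\.\\d+)?)[\\s,]*){3}");

    // True if the pattern is found anywhere in the input (Matcher.find() semantics).
    static boolean found(String input) {
        return COMMAND.matcher(input).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "!fw 90.0 \"hello\" 70.0",
            "!fw \"hello\" 70.0",
            "!fw\"hello\"70.0",
            "!fw ...\"\""
        };
        for (String input : inputs) {
            System.out.println(input + " -> " + found(input));
        }
    }
}
```

Note that the inputs with only two written arguments still match: because the separators are optional, the engine can split 70.0 into 7 and 0.0 to reach three repetitions.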

This is because in your regex, you specify \\d*\\.*\\d*, which means "0-n digits, 0-n dots, 0-n digits". By changing \\.* to \\.? you specify "0-1 dots", which takes care of your dot problem. But that regex would still match . or .9, which is why you make the first number compulsory with a + and then add an optional group for the decimal part, (\\.\\d+)?, which means "1 dot and 1-n digits". Now it will match numbers without a decimal point and numbers with a decimal point, but not numbers such as 3. or .3.
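The difference between the two number sub-patterns can be seen in isolation with a small sketch. It uses Matcher.matches() so that the whole input has to fit the pattern; the class and method names are invented for this example.

```java
import java.util.regex.Pattern;

class NumberPatternDemo {
    // Original number pattern: every part is optional and the dot may repeat.
    static final Pattern LOOSE = Pattern.compile("\\d*\\.*\\d*");
    // Stricter pattern: at least one digit, then optionally a dot followed by digits.
    static final Pattern STRICT = Pattern.compile("\\d+(\\.\\d+)?");

    static boolean isLoose(String s) {
        return LOOSE.matcher(s).matches();
    }

    static boolean isStrict(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"3", "3.0", "3.", ".3", "...", ""}) {
            System.out.println("[" + s + "] loose=" + isLoose(s) + " strict=" + isStrict(s));
        }
    }
}
```

The loose pattern accepts every one of these inputs, including the empty string, while the strict one accepts only 3 and 3.0.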

The {3} specifies that you want exactly three occurrences of this group. If you kept these groups optional with a * you would also get results for input with 0-2 occurrences of your pattern. If this is your intended behaviour, you should consider whether you want to allow multiple whitespaces or commas to appear between your numbers/Strings. If not, you should make the separators dependent on whether there was a String/number before.
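One possible way to tie each separator to its argument is sketched below. It is only an illustration under stated assumptions: each argument is optional, exactly one whitespace or comma must precede it, and the whole input must be a command (Matcher.matches()). It also uses three separate capturing groups instead of a repeated group, since a repeated capturing group in Java keeps only the text of its last repetition. The names OptionalArgumentsDemo and parseArgs are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OptionalArgumentsDemo {
    // A quoted word, or a number with an optional decimal part.
    static final String ARG = "(\"\\w*\"|\\d+(?:\\.\\d+)?)";
    // An argument is optional, but when present it needs exactly one separator before it.
    static final String OPT = "(?:[\\s,]" + ARG + ")?";
    static final Pattern COMMAND = Pattern.compile("(!?)(fw|ri|le|cl|rs)" + OPT + OPT + OPT);

    // Returns the arguments that are present, or null if the input is not a valid command.
    static List<String> parseArgs(String input) {
        Matcher m = COMMAND.matcher(input);
        if (!m.matches()) {
            return null;
        }
        List<String> result = new ArrayList<>();
        // Groups 1 and 2 are the "!" and the command name; groups 3 to 5 are the arguments.
        for (int i = 3; i <= m.groupCount(); i++) {
            if (m.group(i) != null) {
                result.add(m.group(i));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseArgs("!fw 1.0 \"hello\" 3.0"));
        System.out.println(parseArgs("!fw 1.0"));
        System.out.println(parseArgs("!fw 1.0  2.0")); // two spaces between the arguments
    }
}
```

If several separators in a row should be allowed, [\\s,] can be changed to [\\s,]+ in this sketch.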
