How to replace excessive SQL wildcards with a single regex pattern?

Sometowngeek :

I am creating a function that strips illegal wildcard patterns from an input string. The ideal solution should use a single regular expression, if at all possible.

The illegal wildcard patterns are: %% and %_%. Each instance of those should be replaced with %.

Here's the rub: I'm fuzz testing the function by running it against a variety of inputs to try to break it.

It works for the most part, but it fails on more complicated inputs.

The rest of this question has been updated:

The following inputs should return empty string (not an exhaustive list):

The following inputs should return % (not an exhaustive list):

  • %_%
  • %%
  • %%_%%
  • %_%%%
  • %%_%_%
  • %%_%%%_%%%_%

There will also be cases where the input contains other characters, such as:

  • Foo123%_%
    • Should return "Foo123%"
  • B4r$%_%
    • Should return "B4r$%"
  • B4rs%%_%
    • Should return "B4rs%"
  • %%Lorem_%%
    • Should return "%Lorem_%"
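
The expectations listed above could be written down as a small table-driven check, so that any candidate implementation can be run against all of them at once. This is only a sketch: the class name WildcardExpectations, the method names, and the stand-in candidate in main are made up for illustration and are not part of my actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

final class WildcardExpectations {

    // Input -> expected output, copied from the lists above.
    static Map<String, String> expected() {
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("%_%", "%");
        cases.put("%%", "%");
        cases.put("%%_%%", "%");
        cases.put("%_%%%", "%");
        cases.put("%%_%_%", "%");
        cases.put("%%_%%%_%%%_%", "%");
        cases.put("Foo123%_%", "Foo123%");
        cases.put("B4r$%_%", "B4r$%");
        cases.put("B4rs%%_%", "B4rs%");
        cases.put("%%Lorem_%%", "%Lorem_%");
        return cases;
    }

    // Runs the candidate against every case and returns how many cases failed.
    static int countFailures(UnaryOperator<String> candidate) {
        int failures = 0;
        for (Map.Entry<String, String> entry : expected().entrySet()) {
            String actual = candidate.apply(entry.getKey());
            if (!entry.getValue().equals(actual)) {
                System.out.println("FAIL " + entry.getKey()
                        + ": expected " + entry.getValue() + ", got " + actual);
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in candidate: a single naive pass over "%%" only.
        int failures = countFailures(s -> s.replace("%%", "%"));
        System.out.println(failures + " failing case(s)");
    }
}
```

Any of the methods below can be plugged in with a method reference in place of the stand-in lambda.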

I have tried several different patterns, but my tests keep failing.

String input = "%_%%%%_%%%_%";

// old method:
public static String ancientMethod1(String input){
    if (input == null)
        return "";
    return input.replaceAll("%_%", "").replaceAll("%%", "");  // Output: ""
}

// Attempt 1:
// Doesn't quite work right.
// "A%%" is returned as "A%%" instead of "A%"
public static String newMethod1(String input) {
    String result = input;
    while (result.contains("%%") || result.contains("%_%"))
        result = result.replaceAll("%%","%").replaceAll("%_%","%");
    if (result.equals("%"))
        return "";
    return input;
}

// Attempt 2:
// Succeeds, but I would like to simplify this:
public static String newMethod2(String input) {
    if (input == null)
        return "";

    String illegalPattern1 = "%%";
    String illegalPattern2 = "%_%";
    String result = input;

    while (result.contains(illegalPattern1) || result.contains(illegalPattern2)) {
        result = result.replace(illegalPattern1, "%");
        result = result.replace(illegalPattern2, "%");
    }

    if (result.equals("%") || result.equals("_"))
        return "";

    return result;
}

Here's a more complete example of how I'm using this: https://gist.github.com/sometowngeek/697c839a1bf1c9ee58be283b1396cf2e

John Bollinger :

This regular expression string matches all your examples:

"%(?:_?%)+"

It matches a '%' character followed by one or more groups, each consisting of an optional '_' character and then a '%' character (a close-to-literal translation of the pattern). That is another way of saying what I said in the comments: "a sequence of '%' and '_' characters, beginning and ending with '%', and not containing two consecutive '_' characters". Replacing every match with a single '%' in one pass produces the outputs you list, with no loop needed.
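
As a minimal sketch of how that pattern might be applied, assuming a replacement of a single '%' and the same null handling as the question's methods (the class name WildcardCleaner and the method name collapseWildcards are made up for illustration):

```java
import java.util.regex.Pattern;

final class WildcardCleaner {

    // A '%' followed by one or more groups of an optional '_' and a '%'.
    private static final Pattern ILLEGAL_RUN = Pattern.compile("%(?:_?%)+");

    // Collapses every illegal wildcard run into a single '%'.
    // Returning "" for null mirrors the question's methods (an assumption here).
    static String collapseWildcards(String input) {
        if (input == null) {
            return "";
        }
        return ILLEGAL_RUN.matcher(input).replaceAll("%");
    }

    public static void main(String[] args) {
        String[] samples = {
            "%_%", "%%", "%%_%%%_%%%_%", "Foo123%_%", "B4rs%%_%", "%%Lorem_%%"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + collapseWildcards(sample));
        }
    }
}
```

Because the quantified group consumes the whole run greedily, one replaceAll pass is enough. If a lone "%" result should become an empty string, as in your newMethod2, that check would still have to be added separately.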
