Golang Regular expression always returns false?

cbll :

I am taking user input (a regular expression) and checking whether a given line of a file matches it. If there is a match, I return the ID of that line, and that's about it. However, the match appears to always return false. Interestingly, if I pass the wildcard .*, the program takes significantly longer to execute than with a more specific regular expression, so something must be going on. Why does it always return false?

Sample code:

func main() {

    // User input from command line
    reader := bufio.NewReader(os.Stdin)
    fmt.Print("Enter regexp: ")
    userRegexp, _ := reader.ReadString('\n')

    // List all .html files in static dir
    files, err := filepath.Glob("static/*.html")
    if err != nil {
        log.Fatal(err)
    }

    // Empty array of int64's to be returned with matching results
    var lineIdArr []int64

    for _, file := range files {
        htmlFile, _ := os.Open(file)
        fscanner := bufio.NewScanner(htmlFile)

        // Loop over each line
        for fscanner.Scan() {

            line := fscanner.Text()

            match := matchLineByValue(userRegexp, line) // This is always false?

            // ID is always the first item. Split on ":" and parse it as an int64.
            lineIdStr := line[:strings.IndexByte(line, ':')]
            lineIdInt, err := strconv.ParseInt(lineIdStr, 10, 64)

            if err != nil {
                panic(err)
            }

            // If matched, append ID to lineIdArr
            if match {
                lineIdArr = append(lineIdArr, lineIdInt)
            }
        }
    }
    fmt.Println("Return array: ", lineIdArr)
    fmt.Println("Using regular expression: ", userRegexp)
}

func matchLineByValue(re string, s string) bool {
    return regexp.MustCompile(re).MatchString(s)
}

Is regexp.MustCompile(re).MatchString(s) not the right way to construct a regular expression from user input and match it against a whole line?

The string it matches is fairly long (it's basically a whole HTML file). Would that present an issue?

Muffin Top :

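The call userRegexp, _ := reader.ReadString('\n') returns a string with a trailing newline. That newline becomes part of the pattern, so the pattern requires a literal newline in the text, and the lines never contain one because bufio.Scanner strips line endings. Trim the newline: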

 userRegexp, err := reader.ReadString('\n')
 if err != nil {
    // handle error
 }
 userRegexp = userRegexp[:len(userRegexp)-1]

Here's the code with some other improvements (compile regexp once, use scanner Bytes):

// User input from command line
reader := bufio.NewReader(os.Stdin)
fmt.Print("Enter regexp: ")
userRegexp, err := reader.ReadString('\n')
if err != nil {
    log.Fatal(err)
}
userRegexp = userRegexp[:len(userRegexp)-1]
re, err := regexp.Compile(userRegexp)
if err != nil {
    log.Fatal(err)
}

// List all .html files in static dir
files, err := filepath.Glob("static/*.html")
if err != nil {
    log.Fatal(err)
}

// Empty array of int64's to be returned with matching results
var lineIdArr []int64

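for _, file := range files {
    htmlFile, err := os.Open(file)
    if err != nil {
        log.Fatal(err)
    }
    fscanner := bufio.NewScanner(htmlFile)
    // Loop over each line
    for fscanner.Scan() {
        line := fscanner.Bytes()
        if !re.Match(line) {
            continue
        }
        // The ID is everything before the first ':'
        i := bytes.IndexByte(line, ':')
        if i < 0 {
            log.Fatalf("%s: matching line has no ':' separator", file)
        }
        lineIdInt, err := strconv.ParseInt(string(line[:i]), 10, 64)
        if err != nil {
            log.Fatal(err)
        }
        lineIdArr = append(lineIdArr, lineIdInt)
    }
    // Report scan errors, such as a line that is too long for the scanner
    if err := fscanner.Err(); err != nil {
        log.Fatal(err)
    }
    htmlFile.Close()
}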
fmt.Println("Return array: ", lineIdArr)
fmt.Println("Using regular expression: ", userRegexp)
