Removing comments from source code in Java

Anand:

I want to remove all types of comments from a Java source file. For example:

    String str1 = "SUM 10"      /*This is a Comments */ ;   
    String str2 = "SUM 10";     //This is a Comments"  
    String str3 = "http://google.com";   /*This is a Comments*/
    String str4 = "('file:///xghsghsh.html/')";  //Comments
    String str5 = "{\"temperature\": {\"type\"}}";  //comments

Expected output:

    String str1 = "SUM 10"; 
    String str2 = "SUM 10";  
    String str3 = "http://google.com";
    String str4 = "('file:///xghsghsh.html/')";
    String str5 = "{\"temperature\": {\"type\"}}";

I am using the following regular expression to do this:

    System.out.println(str1.replaceAll("[^:]//.*|/\\\\*((?!=*/)(?s:.))+\\\\*/", ""));

This gives me wrong results for str4 and str5. Please help me resolve this issue.

Using Andreas's solution:

        final String regex = "//.*|/\\*(?s:.*?)\\*/|(\"(?:(?<!\\\\)(?:\\\\\\\\)*\\\\\"|[^\\r\\n\"])*\")";
        final String string = "    String str1 = \"SUM 10\"      /*This is a Comments */ ;   \n"
             + "    String str2 = \"SUM 10\";     //This is a Comments\"  \n"
             + "    String str3 = \"http://google.com\";   /*This is a Comments*/\n"
             + "    String str4 = \"('file:///xghsghsh.html/')\";  //Comments\n"
             + "    String str5 = \"{\\\"temperature\\\": {\\\"type\\\"}}\";  //comments";
        final String subst = "$1";

        // The substituted value will be contained in the result variable
        final String result = string.replaceAll(regex,subst);

        System.out.println("Substitution result: " + result);

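It works for everything except str5.

Andreas:

To make it work, you need to "skip" string literals. You can do that by matching string literals and capturing them, so that they can be retained.

The following regex does that, using $1 as the replacement string: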

    //.*|/\*(?s:.*?)\*/|("(?:(?<!\\)(?:\\\\)*\\"|[^\r\n"])*")

See the demo on regex101.

The Java code is then:

    str1.replaceAll("//.*|/\\*(?s:.*?)\\*/|(\"(?:(?<!\\\\)(?:\\\\\\\\)*\\\\\"|[^\r\n\"])*\")", "$1")
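
To try this end to end, the call above can be wrapped in a small class. This is a minimal sketch and not part of the original answer: the class name `CommentStripper` and the method `stripComments` are made up for illustration, and the sample input is the five lines from the question. Only the comments themselves are removed; whitespace around them is left in place, so the result is not trimmed the way the question's expected output is.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, introduced only for this illustration.
final class CommentStripper {

    // The regex from the answer: line comment | block comment | (captured string literal)
    private static final Pattern COMMENT_OR_STRING = Pattern.compile(
            "//.*|/\\*(?s:.*?)\\*/|(\"(?:(?<!\\\\)(?:\\\\\\\\)*\\\\\"|[^\\r\\n\"])*\")");

    static String stripComments(String source) {
        // "$1" re-inserts a matched string literal. For a comment match,
        // group 1 did not participate, so the replacement is empty.
        return COMMENT_OR_STRING.matcher(source).replaceAll("$1");
    }

    public static void main(String[] args) {
        // The five sample lines from the question, as one source text.
        String source = String.join("\n",
                "String str1 = \"SUM 10\"      /*This is a Comments */ ;",
                "String str2 = \"SUM 10\";     //This is a Comments\"  ",
                "String str3 = \"http://google.com\";   /*This is a Comments*/",
                "String str4 = \"('file:///xghsghsh.html/')\";  //Comments",
                "String str5 = \"{\\\"temperature\\\": {\\\"type\\\"}}\";  //comments");

        System.out.println(stripComments(source));
    }
}
```

If trailing whitespace matters, it would have to be cleaned up in a separate step after the replacement.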

Explanation

    //.*                      Match // and rest of line
    |                        or
    /\*(?s:.*?)\*/            Match /* and */, with any characters in-between, incl. linebreaks
    |                        or
    ("                        Start capture group and match "
      (?:                      Start repeating group:
         (?<!\\)(?:\\\\)*\\"     Match escaped " optionally prefixed by escaped \'s
         |                      or
         [^\r\n"]                Match any character except " and linebreak
      )*                       End of repeating group
    ")                        Match terminating ", and end of capture group
    $1                        Keep captured string literal
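
The breakdown above can also be observed by iterating over the individual matches: when group 1 is set, the match is a string literal that gets kept, and when group 1 is null, the match is a comment that gets dropped. The following is only an illustrative sketch; the class `CommentMatchTrace`, the method `trace`, and the `KEEP`/`REMOVE` labels are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical tracing helper, introduced only for this illustration.
final class CommentMatchTrace {

    // Same regex as in the answer above.
    private static final Pattern COMMENT_OR_STRING = Pattern.compile(
            "//.*|/\\*(?s:.*?)\\*/|(\"(?:(?<!\\\\)(?:\\\\\\\\)*\\\\\"|[^\\r\\n\"])*\")");

    // Returns one entry per regex match, labelled by which kind of alternative matched.
    static List<String> trace(String source) {
        List<String> entries = new ArrayList<>();
        Matcher m = COMMENT_OR_STRING.matcher(source);
        while (m.find()) {
            // group(1) is non-null only when the string-literal alternative matched.
            String label = (m.group(1) != null) ? "KEEP " : "REMOVE ";
            entries.add(label + m.group());
        }
        return entries;
    }

    public static void main(String[] args) {
        String line = "String str4 = \"('file:///xghsghsh.html/')\";  //Comments";
        for (String entry : trace(line)) {
            System.out.println(entry);
        }
    }
}
```

Because the string literal is consumed as a single match, the `//` inside it is never examined as a possible comment start.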
