r - Last match of regular expression

ecolog

I am looking for an R pattern-matching expression that extracts the last fully populated taxonomy from each element of the list. The taxonomies always have the same format: one letter, two underscores, and a word (sometimes inside square brackets). Taxonomies that are not fully populated lack the word after the two underscores.

I was able to build an expression that worked on a regular expression builder website:
(.\_\_[A-Za-z\[\]]+)(?!.*__[A-Za-z\[\]])
but had no luck using it, or adapting it, with the R pattern-matching functions in grep {base} or anything similar. Here is one of the things I tried:

clean=gsub("(.\_\_[A-Za-z[]]+)(?!.*__[A-Za-z[]])","\\1",taxonomies,perl = TRUE)

Any suggestions? Thanks!

taxonomies=
  list('k__Bacteria; p__Bacteroidetes; c__[Saprospirae]; o__[Saprospirales]; f__Chitinophagaceae; g__; s__'
       ,'k__Bacteria; p__Actinobacteria; c__MB-A2-108; o__0319-7L14; f__; g__; s__'
       ,'k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales;f__Corynebacteriaceae; g__Corynebacterium; s__'
       ,'k__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__Rhodocyclales; f__Rhodocyclaceae; g__Methyloversatilis; s__'
       ,'k__Bacteria; p__Proteobacteria; c__Deltaproteobacteria; o__Myxococcales; f__; g__; s__'
       ,'k__Bacteria; p__Proteobacteria; c__[Deltaproteobacteria]; o__[W123]; f__[W123]; g__[W123]; s__[W123.012.123]'
       ,'k__Bacteria; p__Bacteroidetes; c__[Saprospirae]; o__[Saprospirales]; f__Chitinophagaceae')

Desired output

[1] "f__Chitinophagaceae"  "o__0319-7L14" "g__Corynebacterium"   
[4] "g__Methyloversatilis" "o__Myxococcales"  "s__[W123.012.123]"   
[7] "f__Chitinophagaceae" 

Edit: Included the desired output and an example gsub call that is not working.

akrun

We can use stri_extract_last from stringi:

library(stringi)
stri_extract_last(unlist(taxonomies), regex = '[A-Za-z]__\\[*[[:alnum:].-]+\\]*')
#[1] "f__Chitinophagaceae"  "o__0319-7L14" "g__Corynebacterium"   
#[4] "g__Methyloversatilis" "o__Myxococcales"  "s__[W123.012.123]"   
#[7] "f__Chitinophagaceae" 

Here, I assumed that the OP meant to extract the characters within **...**. It must be some formatting issue as it was not shown in BOLD.
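
If adding a dependency is not an option, the same idea can probably be expressed in base R. The following is only a sketch, not part of the original answer: it reuses the regex above with gregexpr() and regmatches() to collect every match, then keeps the last one per element. The names x, pattern, all_matches and last_match are made up for illustration, and x is a subset of the question's data.

```r
# Subset of the example data from the question (illustrative variable name)
x <- c(
  'k__Bacteria; p__Bacteroidetes; c__[Saprospirae]; o__[Saprospirales]; f__Chitinophagaceae; g__; s__',
  'k__Bacteria; p__Actinobacteria; c__MB-A2-108; o__0319-7L14; f__; g__; s__',
  'k__Bacteria; p__Proteobacteria; c__[Deltaproteobacteria]; o__[W123]; f__[W123]; g__[W123]; s__[W123.012.123]'
)

# Same pattern as in the stringi call above
pattern <- '[A-Za-z]__\\[*[[:alnum:].-]+\\]*'

# gregexpr() locates every match in each string;
# regmatches() extracts them as a list of character vectors
all_matches <- regmatches(x, gregexpr(pattern, x))

# Keep only the last match of each element (NA if there is none)
last_match <- vapply(
  all_matches,
  function(m) if (length(m)) m[length(m)] else NA_character_,
  character(1)
)
last_match
```

Empty ranks such as "g__" never match because the pattern requires at least one character after the two underscores, so the last match is the last populated rank.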

data

taxonomies=list(
  'k__Bacteria; p__Bacteroidetes; c__[Saprospirae]; o__[Saprospirales]; f__Chitinophagaceae; g__; s__'
  ,'k__Bacteria; p__Actinobacteria; c__MB-A2-108; o__0319-7L14; f__; g__; s__'
  ,'k__Bacteria; p__Actinobacteria; c__Actinobacteria; o__Actinomycetales;f__Corynebacteriaceae; g__Corynebacterium; s__'
  ,'k__Bacteria; p__Proteobacteria; c__Betaproteobacteria; o__Rhodocyclales; f__Rhodocyclaceae; g__Methyloversatilis; s__'
  ,'k__Bacteria; p__Proteobacteria; c__Deltaproteobacteria; o__Myxococcales; f__; g__; s__'
  ,'k__Bacteria; p__Proteobacteria; c__[Deltaproteobacteria]; o__[W123]; f__[W123]; g__[W123]; s__[W123.012.123]'
  ,'k__Bacteria; p__Bacteroidetes; c__[Saprospirae]; o__[Saprospirales]; f__Chitinophagaceae'
  )
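
As for the gsub attempt in the question, two things appear to get in the way. "\_" is not a recognized escape in an R string literal, so the call fails before the regex is ever used. And gsub with "\\1" only replaces the matched part, leaving the rest of the string in place. A possible workaround, sketched here under those assumptions, is to match the whole string and let a greedy ".*" push the capture group onto the last populated rank. The names x and last_rank are illustrative, and x is a subset of the data above.

```r
# Subset of the example data (illustrative variable name)
x <- c(
  'k__Bacteria; p__Actinobacteria; c__MB-A2-108; o__0319-7L14; f__; g__; s__',
  'k__Bacteria; p__Proteobacteria; c__[Deltaproteobacteria]; o__[W123]; f__[W123]; g__[W123]; s__[W123.012.123]',
  'k__Bacteria; p__Bacteroidetes; c__[Saprospirae]; o__[Saprospirales]; f__Chitinophagaceae'
)

# The leading ".*" is greedy, so it swallows everything up to the LAST
# position where the capture group can still match; the trailing ".*"
# consumes any empty ranks that follow. The whole string is then
# replaced by the captured group.
last_rank <- sub(
  '.*([A-Za-z]__\\[?[[:alnum:].-]+\\]?).*',
  '\\1',
  x,
  perl = TRUE
)
last_rank
```

One caveat: a string containing no populated rank at all is returned unchanged by sub(), so an extraction function that yields NA may be safer when such input is possible.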
