regex - Data frame column vector manipulation
I have a data frame mydf:

      content              term
    1 search term: abc|    NA
    2 search term-xyz      NA
    3 search term-pqr|     NA

I made this regex:

    \search term[:]?.?([a-zA-Z]+)\

to get the terms abc, xyz, and pqr.

How can I extract these terms into the term column? I tried str_match and gsub, but I am not getting the correct results.
We can try sub (df1 here is the data frame from the question):

    sub(".*(\\s+|-)", "", df1$content)
    #[1] "abc" "xyz" "pqr"

Or str_extract from stringr:

    library(stringr)
    str_extract(df1$content, "\\w+$")
    #[1] "abc" "xyz" "pqr"

Update

If a | is found at the end of the string:

    gsub(".*(\\s+|-)|[^a-z]+$", "", df1$content)
    #[1] "abc" "xyz" "pqr"

Or:

    str_extract(df1$content, "\\w+(?=(|[|])$)")
    #[1] "abc" "xyz" "pqr"
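The question asks for the terms to end up in the term column, so here is a minimal base R sketch of that last step. It assumes the data frame looks like the one in the question (rebuilt here by hand as mydf) and reuses the gsub pattern from the update; it is one way to do it, not the only one.

```r
# Rebuild the example data from the question (values copied from the post).
mydf <- data.frame(
  content = c("search term: abc|", "search term-xyz", "search term-pqr|"),
  term = NA_character_,
  stringsAsFactors = FALSE
)

# Remove everything up to the last whitespace or hyphen, and also any
# trailing run of characters that are not lowercase letters (such as "|").
mydf$term <- gsub(".*(\\s+|-)|[^a-z]+$", "", mydf$content)

mydf$term
#[1] "abc" "xyz" "pqr"
```

Note that [^a-z]+$ also strips trailing uppercase letters and digits. If the real terms can contain those, the class would need widening, for example to [^[:alnum:]]+$.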