regex - Data frame column vector manipulation
I have a data frame mydf:

      content            term
    1 search term: abc|  NA
    2 search term-xyz    NA
    3 search term-pqr|   NA
I made this regex:

    \search term[:]?.?([a-zA-Z]+)\

to get the terms abc, xyz, and pqr.
How can I extract these terms into the term column? I tried str_match and gsub, but I am not getting correct results.
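We can try sub, which removes everything up to the last whitespace or hyphen. This works when each string ends with the term itself:

    sub(".*(\\s+|-)", "", df1$content)
    #[1] "abc" "xyz" "pqr"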
Or, with stringr, extract the word characters at the end of the string:

    library(stringr)
    str_extract(df1$content, "\\w+$")
    #[1] "abc" "xyz" "pqr"
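As a self-contained check of the base R sub call, here is a minimal sketch that builds a small data frame and writes the result into the term column. The sample values are assumed from the question, with the trailing | left out, and the name df1 follows the answer's code (the question calls the same data frame mydf).

```r
# Sample data assumed from the question, without the trailing "|"
df1 <- data.frame(
  content = c("search term: abc", "search term-xyz", "search term-pqr"),
  term = NA_character_,
  stringsAsFactors = FALSE
)

# Greedy ".*" consumes everything up to the last whitespace run or hyphen,
# so only the final term is left after the replacement
df1$term <- sub(".*(\\s+|-)", "", df1$content)
df1$term
#[1] "abc" "xyz" "pqr"
```

Assigning to df1$term keeps the extracted values next to the original strings, which makes mismatches easy to spot.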
Update

If a | is found at the end of the string:
    gsub(".*(\\s+|-)|[^a-z]+$", "", df1$content)
    #[1] "abc" "xyz" "pqr"

Or

    str_extract(df1$content, "\\w+(?=(|[|])$)")
    #[1] "abc" "xyz" "pqr"
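Another option, sketched here as an assumption rather than as part of the original answer, is to keep the capture-group idea from the question and let sub return only the captured group. The pattern below is a simplified form of the question's regex (":?" in place of "[:]?", without the surrounding backslashes), and the sample data is assumed from the question, including the trailing |.

```r
# Sample data assumed from the question, including the trailing "|"
df1 <- data.frame(
  content = c("search term: abc|", "search term-xyz", "search term-pqr|"),
  term = NA_character_,
  stringsAsFactors = FALSE
)

# Match the whole string, capture the letters that follow "search term",
# an optional ":" and one optional separator character, then replace the
# whole match with the captured group ("\\1")
df1$term <- sub(".*search term:?.?([a-zA-Z]+).*", "\\1", df1$content)
df1$term
#[1] "abc" "xyz" "pqr"
```

Because the pattern spans the whole string, the trailing | is dropped without a separate cleanup step. Strings that do not contain "search term" are returned unchanged by sub, so they may need a separate check.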