r - Extract names of the levels of a factor



I'm trying to read a huge matrix (2.8 GB) into R. So far, the best way I have found is

 require(data.table)
 dt <- fread("bigmatrix.csv")

which is a package I know nothing about!

After reading it, I can tell that the matrix has 3 columns and 50 million rows.

Each row is of the type

                    object1                          object 2  distance
 1: kho.central_khoisan.gwi           kho.central_khoisan.gwi 0.0000000
 2: kho.central_khoisan.gwi         kho.central_khoisan.gxana 0.2195843
 3: kho.central_khoisan.gwi kho.central_khoisan.khoekhoegowab 0.6749363
 4: kho.central_khoisan.gwi          kho.central_khoisan.khwe 0.6089206
 5: kho.central_khoisan.gwi        kho.central_khoisan.korana 0.7163111
 6: kho.central_khoisan.gwi         kho.central_khoisan.kwadi 0.8017179

So it holds the pairwise distances between approximately 6900 objects.

Now comes the problem:

I want to extract the pairwise comparisons of 41 objects. I don't know what the guy who gave me the dataset has called these 41 objects!!

So the solution is to find the levels of dt$object1, write them to a file, and scan them to find the 41 I need. How can I do it?

I tried

foo<-factor(dt$object1) 

So when I call

 foo
 ....
 6895 Levels: aa.beja.beja aa.beja.beja_2 aa.berber.awjilah ... zun.zuni.zuni

But

 foo$levels

gives me an error!

I'm sure there is a smarter way, as in C++ (i.e. loop over each row and insert the name of object 1 into a vector of strings if it's not present yet). How do I do it?


Edit: another question arose:

I have identified the 41 objects I need. How do I extract the data.table rows relevant to me?

I can store the names of the objects in a data frame or a vector.
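One possible way to pull out those rows, sketched on a toy table rather than the real file: keep only the rows where both objects belong to the wanted set, using `%in%`. The column names `object1` and `object2`, the vector `wanted` and all the values below are assumptions made for illustration.

```r
library(data.table)

# Toy stand-in for the table read by fread(); names and distances are made up.
dt <- data.table(
  object1  = c("a", "a", "a", "b", "b", "c"),
  object2  = c("a", "b", "c", "b", "c", "c"),
  distance = c(0, 0.2, 0.7, 0, 0.6, 0)
)

# Hypothetical vector holding the names of the objects of interest.
wanted <- c("a", "c")

# Keep only the rows where both objects are in the wanted set.
sub_dt <- dt[object1 %in% wanted & object2 %in% wanted]
```

If the names are stored in a data frame instead, the relevant column can be passed as a plain vector in place of `wanted`.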

try: levels(as.factor(dt$object1))
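The error with `foo$levels` arises because a factor is an atomic vector, and `$` does not work on atomic vectors; `levels()` is the accessor. A minimal sketch on a small made-up stand-in for `dt` (the column name `object2`, the distances and the output file name are assumptions, not taken from the real dataset):

```r
# Toy stand-in for the real table; only a few made-up rows.
dt <- data.frame(
  object1  = c("kho.central_khoisan.gwi", "kho.central_khoisan.gwi",
               "aa.beja.beja", "zun.zuni.zuni"),
  object2  = c("kho.central_khoisan.gwi", "kho.central_khoisan.gxana",
               "aa.beja.beja", "aa.beja.beja"),
  distance = c(0, 0.2195843, 0, 0.9),
  stringsAsFactors = FALSE
)

foo <- factor(dt$object1)

# `foo$levels` fails on a factor; levels() returns the level names.
object_names <- levels(foo)

# For a character column, sort(unique(...)) gives the same names
# without building the factor first.
same_names <- sort(unique(dt$object1))

# Write one name per line so the list can be scanned in a text editor.
out_file <- file.path(tempdir(), "object_names.txt")  # hypothetical path
writeLines(object_names, out_file)
```

`unique()` plays the role of the C++ loop described in the question: it collects each name the first time it is seen.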

