dplyr::filter() "No tidyselect variables were registered"

Question

I am trying to filter specific rows of a tibble with dplyr::filter().

Here is part of my tibble, head(raw.tb):

# A tibble: 738 x 4
   geno    ind       X     Y
   <chr>   <chr> <int> <int>
 1 san1w16 A1      467   383
 2 san1w16 A1      465   378
 3 san1w16 A1      464   378
 4 san1w16 A1      464   377
 5 san1w16 A1      464   376
 6 san1w16 A1      464   375
 7 san1w16 A1      463   375
 8 san1w16 A1      463   374
 9 san1w16 A1      463   373
10 san1w16 A1      463   372
# ... with 728 more rows

When I run:

raw.tb %>% dplyr::filter(ind == contains("A"))

I get:

Error in filter_impl(.data, quo) : Evaluation error: No tidyselect variables were registered

In my tibble, unique(raw.tb$ind) returns:

 [1] "A1"  "A10" "A11" "A12" "A2"  "A3"  "A4"  "A5"  "A6"  "A7"  "A8"  "A9"  "B1"
[14] "B10" "B11" "B12" "B2"  "B3"  "B4"  "B5"  "B6"  "B7"  "B8"  "B9"  "C1"  "C10"
[27] "C11" "C12" "C2"  "C3"  "C4"  "C5"  "C6"  "C7"  "C8"  "C9"  "D1"  "D10" "D11"
[40] "D12" "D2"  "D3"  "D4"  "D5"  "D6"  "D7"  "D8"  "D9"  "E1"  "E10" "E11" "E12"
[53] "E2"  "E3"  "E4"  "E5"  "E6"  "E7"  "E8"  "E9"  "F1"  "F10" "F11" "F12" "F2"
[66] "F3"  "F4"  "F5"  "F6"  "F7"  "F8"  "F9"  "G1"  "G10" "G11" "G2"  "G3"  "G4"
[79] "G5"  "G6"  "G7"  "G8"  "G9"  "H1"  "H10" "H11"

I would like to extract only the rows where raw.tb$ind starts with "A", using tidyverse syntax.

(I know how to do this in base R, but my goal here is to use the tidyverse.)

Thanks for any feedback.

akrun · Accepted Answer

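filter expects a logical vector with which to filter the rows. The select helper function contains (see ?select_helpers) selects columns of a dataset based on a pattern, so it does not work inside filter. To filter rows, you can use grepl from base R: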

raw.tb %>%
   dplyr::filter(grepl("A", ind))

Or str_detect from stringr (one of the packages in the tidyverse):

raw.tb %>%
  dplyr::filter(stringr::str_detect(ind, "A"))
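
To illustrate the distinction drawn above, here is a minimal sketch on a small made-up tibble (the name demo.tb and its values are hypothetical, not the asker's data). It assumes dplyr and tibble are installed. contains() is used inside select() to pick columns by name, while filter() receives a logical vector with one value per row.

```r
library(dplyr)

# Hypothetical toy data standing in for raw.tb
demo.tb <- tibble::tibble(
  geno = "san1w16",
  ind  = c("A1", "A10", "B1", "C2"),
  X    = c(467L, 465L, 464L, 463L)
)

# contains() is a column-selection helper, so it belongs inside select():
# this keeps the columns whose names contain "n" (geno and ind)
cols_with_n <- demo.tb %>% select(contains("n"))

# filter() needs a logical vector with one element per row;
# grepl() returns exactly that for the ind column
rows_with_A <- demo.tb %>% filter(grepl("A", ind))
```

Here cols_with_n keeps all four rows but only two columns, whereas rows_with_A keeps all columns but only the rows whose ind contains "A".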

Eric Fail · Answer

This simply writes out akrun's comment; @akrun, feel free to take over this answer.

Creating some data,

# dput(raw.tb)
raw.tb <- structure(list(
  geno = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
                   .Label = "san1w16", class = "factor"),
  ind = structure(c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 1L),
                  .Label = c("A1", "B1", "C1", "D1", "E1"), class = "factor"),
  X = c(467L, 465L, 464L, 464L, 464L, 464L, 463L, 463L, 463L, 463L),
  Y = c(383L, 378L, 378L, 377L, 376L, 375L, 375L, 374L, 373L, 372L)),
  .Names = c("geno", "ind", "X", "Y"),
  row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"),
  class = c("tbl_df", "tbl", "data.frame"))

the data,

raw.tb
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san1w16     C1   464   376
#>  6 san1w16     C1   464   375
#>  7 san1w16     D1   463   375
#>  8 san1w16     D1   463   374
#>  9 san1w16     E1   463   373
#> 10 san1w16     A1   463   372

Method #1

raw.tb %>% dplyr::filter(stringr::str_detect(ind, "A"))
#> # A tibble: 3 x 4
#>      geno    ind     X     Y
#>    <fctr> <fctr> <int> <int>
#> 1 san1w16     A1   467   383
#> 2 san1w16     A1   465   378
#> 3 san1w16     A1   463   372

Method #2

raw.tb %>% dplyr::filter(grepl("A", ind))
#> # A tibble: 3 x 4
#>      geno    ind     X     Y
#>    <fctr> <fctr> <int> <int>
#> 1 san1w16     A1   467   383
#> 2 san1w16     A1   465   378
#> 3 san1w16     A1   463   372
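
Both methods match "A" anywhere in the value. The question asks for values that start with "A"; for the IDs shown there the result is the same, but anchoring the pattern makes that intent explicit. The following is a minimal sketch on a made-up tibble (demo.tb and the value "BA1" are hypothetical, added only to show the difference), assuming dplyr, tibble and stringr are installed.

```r
library(dplyr)

# Hypothetical toy data; "BA1" contains "A" but does not start with it
demo.tb <- tibble::tibble(ind = c("A1", "A10", "B1", "BA1"), X = 1:4)

# Unanchored pattern: matches "A" anywhere, so "BA1" is kept too
anywhere <- demo.tb %>% filter(grepl("A", ind))

# Anchored with ^ : keeps only values that start with "A"
starts_regex <- demo.tb %>% filter(grepl("^A", ind))

# Same idea with stringr; str_detect() takes (string, pattern)
starts_stringr <- demo.tb %>% filter(stringr::str_detect(ind, "^A"))

# Base R startsWith() requires a character vector, so convert first
# (this matters when ind is a factor, as in the dput() data above)
starts_base <- demo.tb %>% filter(startsWith(as.character(ind), "A"))
```

The three anchored variants should return the same rows; which one to use is mostly a matter of taste.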