Find the location of a character in a string

Question 1

I would like to find the location of a character in a string.

Say: string = "the2quickbrownfoxeswere2tired"

I would like the function to return 4 and 24, the character positions of the 2s in string.

Question 2

You can use gregexpr:

 gregexpr(pattern ='2',"the2quickbrownfoxeswere2tired")


[[1]]
[1]  4 24
attr(,"match.length")
[1] 1 1
attr(,"useBytes")
[1] TRUE

or perhaps str_locate_all from the stringr package, which (as of stringr version 1.0) is a wrapper for stringi::stri_locate_all rather than gregexpr:

library(stringr)
str_locate_all(pattern ='2', "the2quickbrownfoxeswere2tired")

[[1]]
     start end
[1,]     4   4
[2,]    24  24

Note that you could simply use stringi:

library(stringi)
stri_locate_all(pattern = '2', "the2quickbrownfoxeswere2tired", fixed = TRUE)

Another option in base R would be something like

lapply(strsplit(x, ''), function(x) which(x == '2'))

which should work (given a character vector x).

Question 3

Here is another straightforward alternative.

> which(strsplit(string, "")[[1]]=="2")
[1]  4 24

Question 4

You can reduce the output to just 4 and 24 by using unlist:

unlist(gregexpr(pattern ='2',"the2quickbrownfoxeswere2tired"))
[1]  4 24
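
Two details of gregexpr are easy to miss: it treats the pattern as a regular expression by default, and it returns -1 when nothing matches. The following is a minimal sketch, not part of the answer above, that matches the character literally and returns an empty vector when there is no match. The helper name char_positions is hypothetical.

```r
# Hypothetical helper (char_positions is an illustrative name, not base R).
char_positions <- function(string, ch) {
  # fixed = TRUE matches ch literally, so "." is not treated as a regex wildcard
  pos <- gregexpr(ch, string, fixed = TRUE)[[1]]
  # gregexpr signals "no match" with -1; turn that into an empty vector
  if (pos[1] == -1) integer(0) else as.integer(pos)
}

char_positions("the2quickbrownfoxeswere2tired", "2")  # 4 24
char_positions("a.b.c", ".")                          # 2 4
char_positions("abc", "z")                            # integer(0)
```

as.integer() also drops the match.length and related attributes, so the result prints as a plain vector.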

Question 5

Find the position of the nth occurrence of str2 in str1 (same parameter order as Oracle SQL INSTR); returns 0 if not found.

instr <- function(str1,str2,startpos=1,n=1){
    aa=unlist(strsplit(substring(str1,startpos),str2))
    if(length(aa) < n+1 ) return(0);
    return(sum(nchar(aa[1:n])) + startpos+(n-1)*nchar(str2) )
}


instr('xxabcdefabdddfabx','ab')
[1] 3
instr('xxabcdefabdddfabx','ab',1,3)
[1] 15
instr('xxabcdefabdddfabx','xx',2,1)
[1] 0
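
One edge case to be aware of: strsplit drops a trailing empty piece, so when str2 sits at the very end of str1 (for example instr('xxab','ab')) the function above returns 0. It also interprets str2 as a regular expression. A possible variant, sketched here with gregexpr and literal matching, keeps the same parameter order. The name instr_fixed is hypothetical and this is not the original author's code.

```r
# Hypothetical variant (instr_fixed is an illustrative name, not from the answer).
instr_fixed <- function(str1, str2, startpos = 1, n = 1) {
  # Literal, non-overlapping match positions within the substring
  pos <- gregexpr(str2, substring(str1, startpos), fixed = TRUE)[[1]]
  # -1 means no match at all; also return 0 if there are fewer than n matches
  if (pos[1] == -1 || length(pos) < n) return(0)
  # Shift the position back so it refers to the full string
  pos[n] + startpos - 1
}

instr_fixed('xxabcdefabdddfabx', 'ab')        # 3
instr_fixed('xxabcdefabdddfabx', 'ab', 1, 3)  # 15
instr_fixed('xxabcdefabdddfabx', 'xx', 2, 1)  # 0
instr_fixed('xxab', 'ab')                     # 3
```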

Question 6

To find only the first locations, use lapply() with min():

my_string <- c("test1", "test1test1", "test1test1test1")

unlist(lapply(gregexpr(pattern = '1', my_string), min))
#> [1] 5 5 5

# or the readable tidyverse form (the %>% pipe needs magrittr or dplyr loaded)
my_string %>%
  gregexpr(pattern = '1') %>%
  lapply(min) %>%
  unlist()
#> [1] 5 5 5

To find only the last locations, use lapply() with max():

unlist(lapply(gregexpr(pattern = '1', my_string), max))
#> [1]  5 10 15

# or the readable tidyverse form
my_string %>%
  gregexpr(pattern = '1') %>%
  lapply(max) %>%
  unlist()
#> [1]  5 10 15
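
As a possible base R alternative to the two recipes above, regexpr is vectorized and reports only the first match per string, and vapply can replace unlist(lapply(...)) while checking that each result is a single integer. This is a sketch under the assumption that every string contains the pattern; a string without a match yields -1 in both results.

```r
my_string <- c("test1", "test1test1", "test1test1test1")

# First location: regexpr returns one position per input string
first_pos <- as.integer(regexpr("1", my_string, fixed = TRUE))

# Last location: max over all match positions, one integer per string
last_pos <- vapply(gregexpr("1", my_string, fixed = TRUE), max, integer(1))

first_pos  # 5 5 5
last_pos   # 5 10 15
```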

Question 7

You can also use grep:

grep('2', strsplit(string, '')[[1]])
#4 24

Answer 1

87

Tags: regex, string, r

ricardo

Why use a regex? Doesn't R have an .indexOf() or something like it?

fge

1

I doubt it. The developers were Nixers and assumed everyone knew regex. R's string handling is kind of messy.

IRTFM

Answer 5

How can we extract the integers from the lists/objects returned by your first three solutions?

3pitt

Answer 6

Use regexpr instead of gregexpr to get the integers easily. Or use unlist on the output, as indicated in another answer below.

Arani

Answer 7

41

Here is another straightforward alternative.

> which(strsplit(string, "")[[1]]=="2")
[1]  4 24

Jilber Urbina

Can you explain what [[1]] does?

francoiskroll

@francoiskroll, [[1]] selects the first element of the list.

Prafulla
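
To make the role of [[1]] concrete, here is a small sketch using the string from the question. strsplit returns a list with one element per input string, so a single string gives a list of length 1, and [[1]] pulls out the character vector inside it.

```r
string <- "the2quickbrownfoxeswere2tired"

chars <- strsplit(string, "")  # list of length 1, one element per input string
class(chars)                   # "list"
length(chars)                  # 1

chars[[1]][1:4]                # "t" "h" "e" "2"
which(chars[[1]] == "2")       # 4 24

# With two input strings the list has two elements
length(strsplit(c("a2", "2b2"), ""))  # 2
```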

Answer 8