Regular Expression to remove single characters from the beginning of string, only if there are 2 or more
I'm devastatingly miserable at complex regular expressions, but I would love a nudge in the right direction. I'm trying to parse some authors' names by removing initials, when the full names are used later. I realize there probably won't be a "perfect" solution that catches all exceptions, but I'm looking for a "good enough" solution.
Example input
    C S Clive Staples Lewis
    T H Terence Hanbury White
    R Salvatore
    George R R Martin
    J R R John Ronald Reuel Tolkien
    J K Rowling
Ideal output
    Clive Staples Lewis
    Terence Hanbury White
    R Salvatore
    George R R Martin
    John Ronald Reuel Tolkien
    J K Rowling
I have something along the lines of this:
    $str = preg_replace('#(?:\s+\S{1,2})+\s+#',' ',$str);
This obviously misses the first instance of the single character, but changing that would also remove the R in R Salvatore and the J K in J K Rowling.
Thank you for any insight.
Answer by anubhava for Regular Expression to remove single characters from the beginning of string, only if there are 2 or more
You can use it like this:
    $str = 'C S Clive Staples Lewis';
    $str = preg_replace('#^([A-Z]\s)+(?=([A-Z]+\s+){2,})#i','',$str);
    echo $str; // Clive Staples Lewis

    $str = 'J K Rowling';
    $str = preg_replace('#^([A-Z]\s)+(?=([A-Z]+\s+){2,})#i','',$str);
    echo $str; // J K Rowling
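To see how this pattern behaves on all six names from the question, here is a minimal sketch in JavaScript, where the same regex syntax works unchanged. The names `stripInitials` and `names` are made up for the illustration and are not part of the answer.

```javascript
// Sketch: the pattern above, unchanged, in JavaScript regex syntax.
// stripInitials and names are illustrative names, not part of the answer.
const pattern = /^([A-Z]\s)+(?=([A-Z]+\s+){2,})/i;

function stripInitials(name) {
  // Remove leading single letters only when at least two words,
  // each followed by whitespace, come right after them.
  return name.replace(pattern, '');
}

const names = [
  'C S Clive Staples Lewis',
  'T H Terence Hanbury White',
  'R Salvatore',
  'George R R Martin',
  'J R R John Ronald Reuel Tolkien',
  'J K Rowling',
];

for (const name of names) {
  console.log(stripInitials(name));
}
```

Note that the lookahead only counts the words that follow; it never checks that the initials actually match those names.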
Answer by Casimir et Hippolyte for Regular Expression to remove single characters from the beginning of string, only if there are 2 or more
You can use this:
$result = preg_replace('~^(?:[A-Z]\h){2,}~m', '', $str);
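If you want to add exceptions, you can do it like this:
    $pattern = <<<'LOD'
    ~
    # exceptions to leave untouched (the group name used here is a placeholder)
    (?(DEFINE)
        (?<exceptions> J \h K \h Rowling
                     | J \h F \h Kennedy
                     | C \h P \h E \h Bach )
    )
    # pattern
    ^(?!\g<exceptions>) (?:[A-Z]\h){2,}
    ~xm
    LOD;
    $result = preg_replace($pattern, '', $str);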
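As a rough illustration of why the exceptions matter, here is a sketch in JavaScript. JavaScript has no `\h`, so `[ \t]` stands in for it, and the exceptions are written as a plain negative lookahead; the variable names are made up for the example.

```javascript
// Sketch: the exceptions idea in JavaScript. [ \t] stands in for \h,
// and the exception list is the one from the answer above.
const names = [
  'C S Clive Staples Lewis',
  'T H Terence Hanbury White',
  'R Salvatore',
  'George R R Martin',
  'J R R John Ronald Reuel Tolkien',
  'J K Rowling',
];
const text = names.join('\n');

// Without exceptions, every leading run of two or more initials is
// stripped, including the "J K" of "J K Rowling".
const plain = /^(?:[A-Z][ \t]){2,}/gm;

// With exceptions, lines that start with a listed name are skipped.
const guarded = /^(?!J K Rowling|J F Kennedy|C P E Bach)(?:[A-Z][ \t]){2,}/gm;

const plainResult = text.replace(plain, '');
const guardedResult = text.replace(guarded, '');

console.log(plainResult);
console.log(guardedResult);
```

The first result ends in a bare "Rowling"; the second keeps "J K Rowling" and matches the ideal output from the question.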
Answer by SmokeyPHP for Regular Expression to remove single characters from the beginning of string, only if there are 2 or more
This seems to do what you're after:
    var t = [
        'C S Clive Staples Lewis'
        ,'T H Terence Hanbury White'
        ,'R Salvatore'
        ,'George R R Martin'
        ,'J R R John Ronald Reuel Tolkien'
        ,'J K Rowling'
    ];
    for(var i=0,c=t.length;i<c;i++) {
        // Same pattern as the PHP version below
        var newStr = t[i].replace(/^([A-Z]) ([A-Z])((?: [A-Z])?) (\1\w+ \2\w+( \3\w+)?.+)$/,'$4');
        console.log(newStr);
    }
Do note, however, that this method is limited to 3 initials (though I can't see you ever having more than that!).
On the plus side, it checks that the initials are matched up to names starting with those letters before removing them.
If you need PHP:
    $t = array(
        'C S Clive Staples Lewis'
        ,'T H Terence Hanbury White'
        ,'R Salvatore'
        ,'George R R Martin'
        ,'J R R John Ronald Reuel Tolkien'
        ,'J K Rowling'
    );
    for($i=0,$c=count($t);$i<$c;$i++) {
        $newStr = preg_replace('/^([A-Z]) ([A-Z])((?: [A-Z])?) (\1\w+ \2\w+( \3\w+)?.+)$/','$4',$t[$i]);
        var_dump($newStr);
    }
Answer by gpmurthy for Regular Expression to remove single characters from the beginning of string, only if there are 2 or more
Consider the following Regex...
(?(^(\w\s)+\w{2,}(\s\w{2,}){1,})^(\w\s)+)
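The conditional above uses a whole expression as its condition, which looks like .NET syntax; as far as I know, PCRE (and so PHP's preg_replace) only accepts a group reference or an explicit lookaround in that position. Here is a sketch of the same idea with an ordinary lookahead, written in JavaScript; `dropInitials` is a hypothetical helper name.

```javascript
// Sketch: lookahead form of the conditional above.
// dropInitials is a hypothetical helper name.
const pattern = /^(?=(\w\s)+\w{2,}(\s\w{2,}){1,})(\w\s)+/;

function dropInitials(name) {
  // Strip the leading single characters only if at least two
  // longer words follow them.
  return name.replace(pattern, '');
}

console.log(dropInitials('C S Clive Staples Lewis'));
console.log(dropInitials('J K Rowling'));
console.log(dropInitials('R Salvatore'));
```

The first name loses its initials, while the other two come back unchanged because fewer than two longer words follow.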
Answer by edi_allen for Regular Expression to remove single characters from the beginning of string, only if there are 2 or more
Even though you are using PHP you did not specify a language. So this is a sample in Perl.
    use strict;
    use warnings;

    open my $data_fh, '<', 'Data1.txt' or die "Can't open Data1.txt $!";
    while (my $line = <$data_fh>) {
        # Match an initial only if there is a word starting with that initial later in the string.
        $line =~ s/\b([A-Z])\b (?=.*?\b\1[A-Z]+\b)//xig;
        $line =~ s/^\s*|\s*$//g; # strip leading or trailing space
        print "$line\n";
    }

    # OUTPUT
    # Clive Staples Lewis
    # Terence Hanbury White
    # R Salvatore
    # George R R Martin
    # John Ronald Reuel Tolkien
    # J K Rowling
Answer by Teneff for Regular Expression to remove single characters from the beginning of string, only if there are 2 or more
You can use the following regular expression:
^(?:([A-Z])(?=.*?\1[a-z]+)\s)+
It will match:
    ^                    // from the beginning of the string
    (?:                  // non-capturing group
      ([A-Z])            // capture an uppercase letter
      (?=.*?\1[a-z]+)    // positive lookahead for the letter captured above followed by one or more lowercase characters
      \s                 // followed by a space
    )+                   // one or more times
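As a quick check, here is a sketch that runs this pattern over the six names from the question in JavaScript. The helper name `removeMatchedInitials` is made up for the example.

```javascript
// Sketch: the pattern above applied to the sample names from the question.
// removeMatchedInitials is a hypothetical helper name.
const pattern = /^(?:([A-Z])(?=.*?\1[a-z]+)\s)+/;

function removeMatchedInitials(name) {
  // Each leading initial is removed only if the same uppercase letter,
  // followed by lowercase letters, appears later in the string.
  return name.replace(pattern, '');
}

const names = [
  'C S Clive Staples Lewis',
  'T H Terence Hanbury White',
  'R Salvatore',
  'George R R Martin',
  'J R R John Ronald Reuel Tolkien',
  'J K Rowling',
];

for (const name of names) {
  console.log(removeMatchedInitials(name));
}
```

Because the lookahead has no word boundary, the letter can also be found in the middle of a word: 'D Ronald McDonald' becomes 'Ronald McDonald'. Putting `\b` before `\1` would be one way to tighten that, if it matters for your data.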