Get a specific regex using PhP (Song names)
I've been struggling for quite a while now with this specific regex.
A quick background:
I downloaded lots of songs and gave them proper names, and now I would like to put them in a database so I can practice AJAX, JSON, SQL and PHP.
Every file name has the same structure:
ARTIST - SONGNAME ft. ARTIST (ARTIST Remix)
The "ft. ARTIST" and "(ARTIST Remix)" parts are optional. So far I have managed to get some data with the following regex, but it's not enough.
/(.*) - (.*) [ft\.]* (.*)/
However, that requires a 'ft.' to work, and that part is optional. I then tried using multiple regexes, but I never got past the artist name and song name, which still leaves me with the ft. and () parts.
I've been using http://www.phpliveregex.com/ to test the regex against some of the song names in real time.
Here are some examples of song names I want to parse:
Armin van Buuren - Rain ft. Cathy Burton (Urbanstep Remix).mp3
Alpha Drop - Spring Fever.mp3
Beatcore - Tonight ft. Lynn Boyer.mp3
iru1919 - 天狐.mp3
Answer by Jay Blanchard for Get a specific regex using PhP (Song names)
You can make the group optional by adding a question mark after a group:
(ft\.)?
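Curly brackets do not create a group; in a regex they delimit a counted quantifier, so the same optional match can also be written as:
(ft\.){0,1}
The ? is known as a quantifier and is shorthand for {0,1}.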
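As a minimal sketch of the optional group in practice, the snippet below splits a title from an optional " ft. ARTIST" tail. The helper name splitFeaturing and its pattern are illustrative assumptions, not part of the answer.

```php
<?php
// Hypothetical helper (not from the answer): splits "TITLE ft. ARTIST".
// The whole " ft. ARTIST" tail sits in an optional group, so titles
// without it still match.
function splitFeaturing(string $title): array
{
    if (!preg_match('/^(.+?)(?: ft\. (.+))?$/', $title, $m)) {
        // Nothing to split (for example an empty string)
        return ['title' => $title, 'featuring' => ''];
    }
    // Group 2 is missing from $m when the optional group did not take part
    return ['title' => $m[1], 'featuring' => $m[2] ?? ''];
}

print_r(splitFeaturing('Tonight ft. Lynn Boyer'));
print_r(splitFeaturing('Spring Fever'));
```

Because the group is optional, the second call still matches and simply reports an empty featuring value.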
Answer by fusion3k for Get a specific regex using PhP (Song names)
You have to use a regex like this one:
/(.*) - (.*)( (ft\.)? (.*))?(\([^)]+\))?/
Your regex fails because [ft\.]* is a character class that means 'any of f, t or .', but also because the space after (.*) - (.*) doesn't match the 2nd and 4th examples.
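The difference between the character class and a literal group can be checked directly. This is a small sketch; the variable names and test strings are made up for illustration.

```php
<?php
// [ft\.]* is a character class: any mix of "f", "t" and "." matches.
$asCharacterClass = '/^[ft\.]*$/';
// (?:ft\.)? is a group: only the literal text "ft." (or nothing) matches.
$asLiteralGroup = '/^(?:ft\.)?$/';

var_dump(preg_match($asCharacterClass, 'tt.f..')); // int(1)
var_dump(preg_match($asLiteralGroup, 'tt.f..'));   // int(0)
var_dump(preg_match($asLiteralGroup, 'ft.'));      // int(1)
```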
Edit:
At the end, I think this regex is better than the first I've posted above:
/(.+) - ((?:(?!(ft\.|\()).)+)( *ft\.[^\(]+)?( *\(([^)]+)\))?\.([^.]+)$/
It matches the artist, the title, the optional ft. artist, the optional remix and the file extension separately.
Please note that if there are parentheses in the song title or the featured artist (which is possible), the match fails.
I'm not a regex expert, so my solution is crude and I'm sure there is a better one.
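One way the second regex might be used from PHP is sketched below. The function name parseWithSinglePattern and the result keys are assumptions for illustration. Group 3 belongs to the lookahead and stays empty, and groups 2 and 4 can carry surrounding spaces or the "ft." prefix, so the sketch trims them.

```php
<?php
// Hypothetical wrapper around the regex above; the key names are assumptions.
function parseWithSinglePattern(string $file): ?array
{
    $pattern = '/(.+) - ((?:(?!(ft\.|\()).)+)( *ft\.[^\(]+)?( *\(([^)]+)\))?\.([^.]+)$/';
    if (!preg_match($pattern, $file, $m)) {
        return null; // e.g. parentheses inside the title
    }
    return [
        'artist'    => $m[1],
        'title'     => trim($m[2]),
        // Group 4 still contains the "ft." prefix, so strip it first
        'featuring' => trim((string) preg_replace('/^ *ft\./', '', $m[4])),
        'remix'     => $m[6],
        'extension' => $m[7],
    ];
}

print_r(parseWithSinglePattern('Armin van Buuren - Rain ft. Cathy Burton (Urbanstep Remix).mp3'));
print_r(parseWithSinglePattern('Alpha Drop - Spring Fever.mp3'));
```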
Answer by Dave F for Get a specific regex using PhP (Song names)
Definitions:
next input: The string that is to be examined further for song information.
First I would separate the artist from the rest of the song title:
/(.*) - (.*)\.mp3$/
The first backreference is the 'ARTIST'. The second one is the next input.
Next I would search for an 'ARTIST Remix' (because this is easiest to search for next):
/([^(]*)( \(([^)]*)\))?$/
The first backreference is the next input. The third backreference, which is ([^)]*), refers to the 'ARTIST Remix'. The second backreference can be ignored because it isn't needed; it is the space followed by 'ARTIST Remix' in brackets.
Now you can search for the featured 'ARTIST':
/(.*) ft\. (.*)/
If there is a featured 'ARTIST', the first backreference is the 'SONGNAME' and the second is the featured 'ARTIST'. However, if there is no featured 'ARTIST', then you'll get an empty array because there is no match.
When there is no featured 'ARTIST', or, more specifically, no occurrence of ft., the next input (the remaining string that was to be examined) is the 'SONGNAME'.
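Put together, the three steps could look roughly like the sketch below. The function name parseInSteps and the result keys are assumptions; the three patterns are the ones given above.

```php
<?php
// Hypothetical function chaining the three regexes described above.
function parseInSteps(string $file): ?array
{
    // Step 1: separate the artist from the rest of the file name
    if (!preg_match('/(.*) - (.*)\.mp3$/', $file, $m)) {
        return null;
    }
    $artist = $m[1];
    $next   = $m[2]; // the "next input"

    // Step 2: peel off an optional trailing "(ARTIST Remix)"
    $remix = '';
    if (preg_match('/([^(]*)( \(([^)]*)\))?$/', $next, $m)) {
        $next  = $m[1];
        $remix = $m[3] ?? ''; // group 3 is absent when there is no remix
    }

    // Step 3: look for a featured artist; without a match,
    // whatever is left is the song name
    $featuring = '';
    if (preg_match('/(.*) ft\. (.*)/', $next, $m)) {
        $next      = $m[1];
        $featuring = $m[2];
    }

    return [
        'artist'    => $artist,
        'songname'  => $next,
        'featuring' => $featuring,
        'remix'     => $remix,
    ];
}

print_r(parseInSteps('Armin van Buuren - Rain ft. Cathy Burton (Urbanstep Remix).mp3'));
```

Running the steps in this order keeps each pattern small, at the cost of three preg_match calls per file name.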
Answer by trincot for Get a specific regex using PhP (Song names)
As others have stated, [ft\.]* will match any of the listed characters, in any order, any number of times.
I propose this regex:
/^(.+?)\h+-\h+(.+?)(?:\h+(?:ft\.)?\h+(.*?))?\h*(?:\((.*?)\))?\.mp3$/
Break-down:
^ : start of string
(.+?) : one or more characters (non-greedy), captured as group 1
\h+ : one or more horizontal white-spaces (space, non-breaking space, ...)
- : literal hyphen
\h+ : one or more horizontal white-spaces (space, non-breaking space, ...)
(.+?) : one or more characters (non-greedy), captured as group 2
(?: )? : optional, non-capturing group, which has:
    \h+ : one or more horizontal white-spaces (space, non-breaking space, ...)
    (?:ft\.)? : optional, non-capturing literal ft.
    \h+ : one or more horizontal white-spaces (space, non-breaking space, ...)
    (.*?) : zero or more characters (non-greedy), captured as group 3
\h* : zero or more horizontal white-spaces (space, non-breaking space, ...)
(?: )? : optional, non-capturing group, which has:
    \( : literal (
    (.*?) : zero or more characters (non-greedy), captured as group 4
    \) : literal )
\.mp3 : literal .mp3
$ : end of string, so in combination with ^ the whole string must match
Used in PHP code, it looks like this:
$songs = array(
    'Armin van Buuren - Rain ft. Cathy Burton (Urbanstep Remix).mp3',
    'Alpha Drop - Spring Fever.mp3',
    'Beatcore - Tonight ft. Lynn Boyer.mp3',
    'iru1919 - 天狐.mp3'
);

// Prepare results array
$results = array();

// Define key names that will be used in each element
$keys = array("artist", "songname", "featuring", "remixBy");

// Iterate over input
foreach ($songs as $song) {
    if (preg_match(
        "/^(.+?)\h+-\h+(.+?)(?:\h+(?:ft\.)?\h+(.*?))?\h*(?:\((.*?)\))?\.mp3$/",
        $song,
        $matches
    )) {
        // Remove original string (at position 0)
        array_shift($matches);
        // Convert matched items (groups) to associative array
        // and add to result
        $results[] = array_combine($keys, array_pad($matches, 4, ''));
    } else {
        echo "This file name doesn't match the pattern: $song";
    }
}

// Output results:
echo json_encode($results, JSON_PRETTY_PRINT);
Output is:
[
    {
        "artist": "Armin van Buuren",
        "songname": "Rain",
        "featuring": "Cathy Burton",
        "remixBy": "Urbanstep Remix"
    },
    {
        "artist": "Alpha Drop",
        "songname": "Spring Fever",
        "featuring": "",
        "remixBy": ""
    },
    {
        "artist": "Beatcore",
        "songname": "Tonight",
        "featuring": "Lynn Boyer",
        "remixBy": ""
    },
    {
        "artist": "iru1919",
        "songname": "\u5929\u72d0",
        "featuring": "",
        "remixBy": ""
    }
]
Variation without "Remix"
If you want to have the word "Remix" itself excluded from the results, then you could extend the regular expression to this:
/^(.+?)\h+\-\h+(.+?)(?:\h+(?:ft\.)?\h+(.*?))?\h*(?:\((.*?)(?:\h+Remix)?\))?\.mp3$/
Note the added group:
(?:\h+Remix)? : a non-capturing, optional group, matching one or more white spaces and the literal Remix.
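A quick way to try the extended pattern is the sketch below. The helper name remixBy is an assumption that mirrors the key used in the results array; it returns only group 4.

```php
<?php
// Hypothetical helper: returns the remixer captured by group 4 of the
// extended pattern, or an empty string when there is no "(...)" part.
function remixBy(string $file): string
{
    $pattern = '/^(.+?)\h+\-\h+(.+?)(?:\h+(?:ft\.)?\h+(.*?))?\h*(?:\((.*?)(?:\h+Remix)?\))?\.mp3$/';
    if (!preg_match($pattern, $file, $m)) {
        return '';
    }
    // Trailing groups that did not take part are absent from $m
    return $m[4] ?? '';
}

var_dump(remixBy('Armin van Buuren - Rain ft. Cathy Burton (Urbanstep Remix).mp3'));
var_dump(remixBy('Alpha Drop - Spring Fever.mp3'));
```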
With this variation, the output for the first song would have as last key:
"remixBy": "Urbanstep"
Answer by Jan for Get a specific regex using PhP (Song names)
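A complete walk-through in PHP would be:
<?php

// Sample input: the four file names from the question, one per line
// (this part was lost from the post and is reconstructed here).
$string = <<<DATA
Armin van Buuren - Rain ft. Cathy Burton (Urbanstep Remix).mp3
Alpha Drop - Spring Fever.mp3
Beatcore - Tonight ft. Lynn Boyer.mp3
iru1919 - 天狐.mp3
DATA;

$regex = '~
    ^(?<artist>[^-]+)  # capture everything but a dash to group "artist"
    -
    (?<rest>.*)        # capture everything but .mp3 to group "rest"
    (?:\.mp3)
    $
    ~xm';              # multiline and freespace mode

preg_match_all($regex, $string, $matches, PREG_SET_ORDER);

foreach ($matches as $match) {
    $artist = trim($match["artist"]);
    // Pad to two elements so names without "ft." do not raise a warning
    list($title, $artist2) = array_pad(preg_split("~ft\.~", $match["rest"]), 2, "");
    $artist2 = trim($artist2);
    echo "Artist: " . $artist
        . ", Title: " . trim($title)
        . (!empty($artist2) ? ", Second Artist: $artist2" : "")
        . "\n";
}

// output:
// Artist: Armin van Buuren, Title: Rain, Second Artist: Cathy Burton (Urbanstep Remix)
// Artist: Alpha Drop, Title: Spring Fever
// Artist: Beatcore, Title: Tonight, Second Artist: Lynn Boyer
// Artist: iru1919, Title: 天狐
?>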