PHP - Apply 2 preg_split with regex to text
Context: I have to split an email containing several customers' reservation details, received every day, according to a set of rules. Example of such an email:
a n k u n f t 11.08.15 *** neubuchung *** 11.08.15 xxx xxx x3 2830 14:25 17:50 18.08.15 xxx xxx x3 2831 18:40 f882129 dsdsaidsaia f882129 xxxyxyagydaysd sadsdsdsdsadsadadssda sadsdsdsdsadsadadssda **«cut here2»** n k u n f t 18.08.15 *** neubuchung *** 11.08.15 xxx xxx x3 2830 14:25 17:50 18.08.15 xxx xxx x3 2831 18:40 f881554 zxcxzcxcxzccxz f881554 xcvcxvcxvcvxc f881554 xvcxvcxcvxxvccvxxcv **«cut here»** 11.08.15 xxx xxx x3 2830 14:25 17:50 18.08.15 xxx xxx x3 2831 18:40 f881605 xczxcdfsfdsdfs f881605 zxccxzxzdffdsfds **«cut here»**
So the text has to be cut whenever the last f999999 appears (where 9 can be any digit), because f999999 is the reservation code.* I inserted the text «cut here» to make it easier to see where to cut.
*Note: the reservation code may have the following formats: f999999, a999999, e999999 or 999999.
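The four formats in the note can be written as one small pattern. The following is only a sketch: `isReservationCode` is a hypothetical helper that does not appear in the original post, the `i` flag is an assumption in case the real emails use upper-case prefixes, and `e123456` is an invented sample value.

```php
<?php
// Hypothetical helper (not from the original post): true if $token looks
// like a reservation code, i.e. an optional f, a or e followed by six digits.
function isReservationCode(string $token): bool
{
    // \A and \z anchor the whole token; the i flag also accepts F, A and E.
    return preg_match('/\A[fae]?\d{6}\z/i', $token) === 1;
}

// A few tokens taken from the sample emails, plus one invented e-code.
foreach (['f882129', 'a025808', 'e123456', '881605', 'x3', '1290689'] as $token) {
    printf("%-8s %s\n", $token, isReservationCode($token) ? 'code' : 'not a code');
}
```

The seven-digit PNR `1290689` is rejected because the pattern requires exactly six digits.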
So I apply a working preg_split with the following regex:
regex1 = "/(?:\\s(f|a|e)?\\d{6}\\s?+.*?\r\n\\s?\r\n)\\K/ms";
However, I also have to cut where «cut here2» appears, because there is text after the reservation code delimiter.
So I created this regex:
regex2 = "/^\h*(f|a|e)?\d{6}.*?\R{2}\K/ms"
Yet I also have this format (with blank lines between the f999999 lines of the same reservation), which makes the previous regex (regex2) cut where it says «not cut here»:
a n k u n f t 11.08.15 *** neubuchung *** 11.08.15 xxx xxx x3 2830 14:25 17:50 18.08.15 xxx xxx x3 2831 18:40 f882129 dsdsaidsaia <<not cut here>> f882129 xxxyxyagydaysd sadsdsdsdsadsadadssda sadsdsdsdsadsadadssda **«cut here»** n k u n f t 18.08.15 *** neubuchung *** 11.08.15 xxx xxx x3 2830 14:25 17:50 18.08.15 xxx xxx x3 2831 18:40 f881554 zxcxzcxcxzccxz <<not cut here>> f881554 xcvcxvcxvcvxc f881554 xvcxvcxcvxxvccvxxcv **«cut here»** 11.08.15 xxx xxx x3 2830 14:25 17:50 18.08.15 xxx xxx x3 2831 18:40 f881605 xczxcdfsfdsdfs f881605 zxccxzxzdffdsfds **«cut here»**
I only want to cut where «cut here» appears.
This error happens in this example:
***neubuchung *** 23.02.17 dus fnc de 1414 12:05 15:10 09.03.17 fnc dus de 1415 16:40 fnc011 enotel baia 9360-215 ponta sol 1 dz typ meerblick 2erw. frühstück 03.10.16 crs: mx - pnr: 1290689 fluggeber: condor flugdienst / pnr: 1290689 frühbucher 10% inkl. reiseleitung und transfer ab/bis a025808 herr berg, ulrich 62 <<not cut here> anfrage. a025808 frau berghaus, petra 58 **«cut here»** ***s t o r n o ** 04.10.16 str x3 2810 11.10.16 fnc str x3 2811 18:15 fnc036 flame tree funchal 1 dz meerblick 2erw. h a987025 frau burg, gertrud *** storno *** o <<not cut here>> a987025 herr burg, walter *** storno *** o **«cut here»** ***Änderung *** neu:01.11.16 fra x3 2806 13:35 16:50 08.11.16 fnc fra x3 2807 17:40 fnc813 golden residence/wanderk. 9000-105 funchal 1 suite seitl. meerblick 3erw. f a982512 frau krost, simone frühbucher 15% <<not cut here>> inkl. reiseleitung und transfer ab/bis im reisepreis bereits enthalten: drei geführte wanderungen (1 ganztags- und 2 halbtagswanderungen) inkl. aller transfers. **«should cut here»** ***Änderung *** alt:01.11.16 fra x3 2806 13:35 16:50 08.11.16 fnc fra x3 2807 17:40 fnc813 golden residence/wanderk. 9000-105 funchal 1 suite seitl. meerblick 3erw. f a982512 herr krost, simone **«cut here»** 25.04.17 drs fnc st 1602 13:25 17:15 09.05.17 fnc drs st 1607 00:00 fnc076 baia azul 9004-530 funchal 1 dz typ meerblick 2erw. halbpension 03.10.16 crs: mx - pnr: 15326821 fluggeber: alltours / pnr: 15326821 inkl. reiseleitung und transfer ab/bis flughafen a025986 herr schulze, steffen 55 a025986 frau schulze, kerstin 54 **«cut here»** ***s t o r n o ** 13.11.16 fra x3 2806 20.11.16 fnc fra x3 2807 17:35 fnc096 pestana village & miramar funchal 1 studio 2erw. h a976918 frau hebing, bettina *** storno *** o <<not cut here>> a976918 herr hebing, ludger *** storno *** o **«cut here»**
I put «not cut here» where it splits but shouldn't, «should cut here» where it should cut, and «cut here» where it cuts correctly.
You may use
'~^\h*f\d{6}.*?\R{2}\K~sm'
See the regex demo.
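Details:
- `^` - start of a line
- `\h*` - 0+ horizontal whitespaces
- `f\d{6}` - `f` + 6 digits
- `.*?` - any 0+ chars (including line breaks, because of the `s` modifier), as few as possible, up to the first
- `\R{2}` - 2 line breaks
- `\K` - match reset operator: omits the whole text matched so far from the match.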
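The effect of the trailing match reset (`\K` in PCRE) is easiest to see by running the same pattern with and without it. This is a minimal sketch on a made-up two-reservation string, not on the real email. `PREG_SPLIT_NO_EMPTY` is added here only to drop empty chunks and is not part of the original answer.

```php
<?php
// Made-up sample: two reservations, each ended by a blank line.
$text = "11.08.15 xxx\nf111111 anna\nf111111 ben\n\n"
      . "18.08.15 yyy\nf222222 carl\n\n";

// With \K the match is reduced to an empty string right after the blank
// line, so preg_split cuts there and keeps all of the text.
$withK = preg_split('~^\h*f\d{6}.*?\R{2}\K~ms', $text, -1, PREG_SPLIT_NO_EMPTY);

// Without \K the matched reservation lines act as the delimiter
// and are removed from the result.
$withoutK = preg_split('~^\h*f\d{6}.*?\R{2}~ms', $text, -1, PREG_SPLIT_NO_EMPTY);

print_r($withK);
print_r($withoutK);
```

With `\K`, each chunk still contains its reservation lines. Without it, only the flight lines are left.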
See the PHP demo:
$re = '~^\h*f\d{6}.*?\R{2}\K~ms';
$str = "a n k u n f t 11.08.15\n*** neubuchung ***\n 11.08.15 xxx xxx x3 2830 14:25 17:50\n 18.08.15 xxx xxx x3 2831 18:40\n f882129 dsdsaidsaia\n f882129 xxxyxyagydaysd\nsadsdsdsdsadsadadssda\nsadsdsdsdsadsadadssda\n\na n k u n f t 18.08.15\n*** neubuchung ***\n 11.08.15 xxx xxx x3 2830 14:25 17:50\n 18.08.15 xxx xxx x3 2831 18:40\n f881554 zxcxzcxcxzccxz\n f881554 xcvcxvcxvcvxc\n f881554 xvcxvcxcvxxvccvxxcv\n\n\n11.08.15 xxx xxx x3 2830 14:25 17:50\n 18.08.15 xxx xxx x3 2831 18:40\n f881605 xczxcdfsfdsdfs\n f881605 zxccxzxzdffdsfds\n\n";
print_r(preg_split($re, $str));
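For the later format in the question, where a blank line separates two lines of the same reservation, one possible extension is to capture the code and refuse to cut when the next line starts with that same code. The sketch below is an assumption-laden variant, not part of the original answer: the sample string and its line layout are invented, the `[fae]?` prefix and the `i` flag follow the question's note on code formats, and it does not handle the case where free text (rather than a code line) follows the blank line.

```php
<?php
// Assumed layout: a blank line separates two lines of the same reservation.
$text = "11.08.15 xxx\n f882129 aaa\n\n f882129 bbb\nccc\n\n"
      . "18.08.15 yyy\n f881554 ddd\n\n";

// regex2 from the question: cuts at every blank line after a code line.
$regex2 = '~^\h*(f|a|e)?\d{6}.*?\R{2}\K~ms';

// Sketch: capture the code in group 1 and use a negative lookahead so the
// cut is skipped when the next line starts with the same code again.
$sketch = '~^\h*([fae]?\d{6})\b.*?\R{2}(?!\h*\1\b)\K~msi';

$before = preg_split($regex2, $text, -1, PREG_SPLIT_NO_EMPTY);
$after  = preg_split($sketch, $text, -1, PREG_SPLIT_NO_EMPTY);

printf("regex2: %d chunks, sketch: %d chunks\n", count($before), count($after));
print_r($after);
```

On this input, regex2 splits the first reservation in two, while the lookahead variant keeps both `f882129` lines in one chunk.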