Tag Archives: parsing

Parsing XML using PHP

Questions: I’ve consistently had an issue with parsing XML with PHP and have not really found “the right way”, or at least a standardised way, of parsing XML files. Firstly, I’m trying to parse this: <item> <title>2884400</title> <description><![CDATA[ ><img width="126" alt="" src="http://userserve-ak.last.fm/serve/126/27319921.jpg" /> ]]></description> <link>http://www.last.fm/music/+noredirect/Beatles/+images/27319921</link> <author>anne710</author> <pubDate>Tue, 21 Apr 2009 16:12:31 +0000</pubDate> <guid>http://www.last.fm/music/+noredirect/Beatles/+images/27319921</guid> <media:content url="http://userserve-ak.last.fm/serve/_/27319921/Beatles+2884400.jpg" fileSize="13065"…

Parsing JSON error: SyntaxError: JSON.parse: unexpected character at line 1 column 2 of the JSON data

Questions: I have a problem when parsing JSON passed from PHP to JavaScript. This is my example code: //function MethodAjax = function (wsFile, param) { return $.ajax({ type: "POST", dataType: "json", url: '../proses/' + wsFile + ".proses.php", data: 'param='+param, error: function (msg) { return; }, }); }; //call function $(document).ready(function() { $('#getproduk').click(function(){ var param =…

Parsing XML with PHP's simpleXML

Questions: I’m learning how to parse XML with PHP’s SimpleXML. My code is: <?php $xmlSource = "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?> <Document xmlns=\"http://www.apple.com/itms/\" artistId=\"329313804\" browsePath=\"/36/6407\" genreId=\"6507\"> <iTunes> myApp </iTunes> </Document>"; $xml = new SimpleXMLElement($xmlSource); $results = $xml->xpath("/Document/iTunes"); foreach ($results as $result){ echo $result.PHP_EOL; } print_r($result); ?> When this runs it returns a blank screen, with…
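The excerpt above is cut off, so the following is only a sketch of one common cause of an empty XPath result with SimpleXML, not necessarily the answer to this question. The `<Document>` element declares a default namespace, and an unprefixed XPath step only matches elements in no namespace. The prefix `itms` below is an arbitrary name chosen for illustration; it does not appear in the original code.

```php
<?php
// Same document as in the question, written with single quotes to avoid escaping.
$xmlSource = '<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Document xmlns="http://www.apple.com/itms/" artistId="329313804" browsePath="/36/6407" genreId="6507">
  <iTunes> myApp </iTunes>
</Document>';

$xml = new SimpleXMLElement($xmlSource);

// Unprefixed steps look for elements in no namespace, so this finds nothing.
$unprefixed = $xml->xpath('/Document/iTunes');

// Bind a prefix (hypothetical name "itms") to the document's default namespace URI.
$xml->registerXPathNamespace('itms', 'http://www.apple.com/itms/');
$prefixed = $xml->xpath('/itms:Document/itms:iTunes');

foreach ($prefixed as $result) {
    // Cast to string to read the element text; trim the surrounding spaces.
    echo trim((string) $result), PHP_EOL;
}
```

The prefix only has to map to the same namespace URI as the document; it need not match any prefix used in the XML itself.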

LinkedIn sharing URLs / not parsing Open Graph

Questions: The LinkedIn documentation can be found here. As it says, it needs: og:title, og:description, og:image, og:url. Here is an example from the source of my WordPress blog, where for simplicity I use the Jetpack plug-in: <!-- Jetpack Open Graph Tags --> <meta property="og:type" content="article" /> <meta property="og:title" content="Starbucks Netherlands Intel" /> <meta property="og:url" content="http://lorentzos.com/starbucks-netherlands-intel/" /> <meta…

Parsing command arguments in PHP

Questions: Is there a native “PHP way” to parse command arguments from a string? For example, given the following string: foo "bar \"baz\"" '\'quux\'' I’d like to create the following array: array(3) { [0] => string(3) "foo" [1] => string(7) "bar "baz"" [2] => string(6) "'quux'" } I’ve already tried to leverage token_get_all(), but PHP’s…

Regular expression for parsing CSV in PHP

Questions: I already managed to split the CSV file using this regex: "/,(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/" But I ended up with an array of strings that contain the opening and closing double quotes. Now I need a regex that would strip the delimiting double quotes from those strings. As far as I know the CSV format can encapsulate…

Is there a way to keep entities intact while parsing html with DomDocument?

Questions: I have this function to ensure every img tag has an absolute URL: function absoluteSrc($html, $encoding = 'utf-8') { $dom = new DOMDocument(); // Workaround to use proper encoding $prehtml = "<html><head><meta http-equiv=\"Content-Type\" content=\"text/html; charset={$encoding}\"></head><body>"; $posthtml = "</body></html>"; if($dom->loadHTML( $prehtml . trim($html) . $posthtml)){ foreach($dom->getElementsByTagName('img') as $img){ if($img instanceof DOMElement){ $src = $img->getAttribute('src'); if( strpos($src,…

Parsing of badly formatted HTML in PHP

Questions: In my code I convert some styled XLS documents to HTML using OpenOffice. I then parse the tables using xml_parser_create. The problem is that OpenOffice creates old-school HTML with unclosed <BR> and <HR> tags; it doesn’t create doctypes and doesn’t quote attributes (<TABLE WIDTH=4>). The PHP parsers I know of don’t like this, and…

Going where PHP parse_url() doesn't – Parsing only the domain

Questions: PHP’s parse_url() has a host field, which includes the full host. I’m looking for the most reliable (and least costly) way to return only the domain and TLD. Given the examples: for http://www.google.com/foo, parse_url() returns www.google.com for host; for http://www.google.co.uk/foo, parse_url() returns www.google.co.uk for host. I am looking for only google.com or google.co.uk. I have contemplated…
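As a minimal sketch of the behaviour described in the question, the snippet below runs parse_url() on the two example URLs and prints the host component it returns. It only illustrates the starting point; it does not try to reduce the host to google.com or google.co.uk, since that requires knowing which suffixes (such as co.uk) count as a TLD. The variable names are illustrative.

```php
<?php
// The two example URLs from the question.
$urls = ['http://www.google.com/foo', 'http://www.google.co.uk/foo'];

$hosts = [];
foreach ($urls as $url) {
    // PHP_URL_HOST asks parse_url() for just the host component as a string.
    $hosts[$url] = parse_url($url, PHP_URL_HOST);
}

foreach ($hosts as $url => $host) {
    // Prints the full host, including the "www." subdomain.
    echo $url, ' => ', $host, PHP_EOL;
}
```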