
Parse XML Document Using Perl

XML documents are commonly used to store small amounts of structured data that can be exchanged easily between applications. Perl has many modules for parsing XML content, such as XML::Simple, XML::Parser, and XML::Smart. After importing one of these modules in a Perl script, the XML content of a variable or a file can be parsed easily. The module must be installed before it can be used in a script. This tutorial shows how to parse XML data using different Perl modules.

Different Examples of Parsing the XML Data

The examples of this tutorial use the XML::Simple and XML::Smart modules of Perl to parse the XML data.

Example 1: Parse the XML Data Using XML::Simple

Run the following command to install the XML::Simple module to parse the XML data using the Perl script:

$ sudo apt-get install libxml-simple-perl

 

Create a Perl file with the following script that reads the XML data from a variable using the XML::Simple module and prints the values using the Data::Dumper module. Here, the XMLin() function is used to read the XML data from the variable and return it as a nested Perl data structure. Next, the Dumper() function is used to print that structure.

#!/usr/bin/env perl
 
use strict;
use warnings;
use XML::Simple;
use Data::Dumper;
 
#Store XML data in a variable
my $xmldata =
q{<Products>
  <Product>
    <name>Samsung HDD</name>
    <type>HDD</type>
    <size>5TB</size>
    <price>50</price>
  </Product>
  <Product>
    <name>HP Monitor</name>
    <type>Monitor</type>
    <size>15 Inches</size>
    <price>70</price>
  </Product>
</Products>};
 
#Parse the XML string into a nested Perl data structure (a hash reference)
my $xmlObj = XMLin($xmldata);
#Print the data structure using Data::Dumper
print Dumper($xmlObj);

 

Output:

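The following output appears after executing the script. It shows a hash with a “Product” key that contains one entry for each of the two “Product” nodes of the XML data. The value of each “name” node is not printed with its node name. Instead, it is used as the key of that product's entry, because XML::Simple folds nested nodes on the “name”, “key”, and “id” keys by default (the KeyAttr option). The other child nodes of each product (“type”, “size”, and “price”) are printed under that key.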
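XML::Simple folds a list of nodes into a hash keyed by the “name” value because “name” is one of its default KeyAttr keys, which is why the product names appear without a node name in the output of Example 1. The following is a minimal sketch, assuming the same product data, of how this folding could be turned off with KeyAttr => [] so that each product keeps its “name” field. ForceArray => ['Product'] keeps “Product” as an array even if the data contains a single product. The parse_products() helper name is hypothetical and is used only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

#Same product data as in Example 1
my $xmldata =
q{<Products>
  <Product>
    <name>Samsung HDD</name>
    <type>HDD</type>
    <size>5TB</size>
    <price>50</price>
  </Product>
  <Product>
    <name>HP Monitor</name>
    <type>Monitor</type>
    <size>15 Inches</size>
    <price>70</price>
  </Product>
</Products>};

#Hypothetical helper: returns a reference to the array of products
sub parse_products {
    my ($xml) = @_;
    #KeyAttr => [] disables folding; ForceArray keeps Product as an array
    my $data = XMLin($xml, KeyAttr => [], ForceArray => ['Product']);
    return $data->{Product};
}

#Print one line per product in document order
my $products = parse_products($xmldata);
foreach my $product (@$products) {
    print "$product->{name} ($product->{type}): $product->{price}\n";
}
```

With the folding disabled, the products are stored in an array in the same order as in the XML data, and “name” is an ordinary field of each product.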

Example 2: Parse the XML File Using XML::Simple

The previous example showed how to parse XML data from a variable using the XML::Simple module. This module can also parse the data of an XML file. Create an XML file named “clients.xml” with the following content. The root element of this XML file is “Company”, which has three child nodes named “client”. Each “client” node has an “id” attribute and three child nodes: “cname”, “address”, and “contact_no”.

<Company>
   <client id="6745">
        <cname>Md. Abdullah</cname>
        <address>78, Mirpur, Dhaka.</address>
        <contact_no>+8801954563423</contact_no>
   </client>
   <client id="8956">
        <cname>Nila Hasan</cname>
        <address>124/A, Dhanmondi, Dhaka</address>
        <contact_no>+8801833654234</contact_no>
   </client>
   <client id="3489">
        <cname>Farhan Hossain</cname>
        <address>67, Jigatola, Dhaka</address>
        <contact_no>+8801699453876</contact_no>
   </client>
</Company>

 

Now, create a Perl file with the following script that reads the “clients.xml” file and prints the values of all its elements in tabular form. The XMLin() function is used with two arguments here. The first argument is the name of the XML file that was created earlier. The second argument sets the value of “ForceArray” to 1 so that the content of every nested element is read as an array, which is why each value is accessed with the [0] index. Because each “client” node has an “id” attribute, XML::Simple uses the id values as the keys of the “client” hash. Next, the “foreach” loop iterates over these keys and prints the name, address, and contact number of each client.

#!/usr/bin/env perl
use strict;
use warnings;
use 5.34.0;
use XML::Simple qw(XMLin);

#Create object to read the XML file
my $xml = XMLin("clients.xml", ForceArray => 1);

#Set the title of the output
say "Name\t\tAddress\t\t\tPhone";

#Parse each node value of the XML file
foreach my $id (keys %{$xml->{client}})
{
    print $xml->{client}{$id}{cname}[0], "\t",
          $xml->{client}{$id}{address}[0], "\t",
          $xml->{client}{$id}{contact_no}[0], "\n";
}

 

Output:

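The following output appears after executing the script. The “clients.xml” file contains the information of three clients, which is printed in tabular format in the output. The rows may appear in a different order than in the file because Perl does not keep hash keys in insertion order.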
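The loop in Example 2 reads the client IDs with keys(), which returns them in no particular order, and separates the columns with tabs. The following is a minimal sketch, not part of the original script, that sorts the IDs numerically and aligns the columns with sprintf(). To keep the sketch runnable on its own, two records of “clients.xml” are copied into a variable; a filename can be passed to XMLin() in the same position. The client_rows() helper name and the column widths are assumptions made for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

#Inline copy of two records from clients.xml
my $xmldata =
q{<Company>
   <client id="6745">
        <cname>Md. Abdullah</cname>
        <address>78, Mirpur, Dhaka.</address>
        <contact_no>+8801954563423</contact_no>
   </client>
   <client id="3489">
        <cname>Farhan Hossain</cname>
        <address>67, Jigatola, Dhaka</address>
        <contact_no>+8801699453876</contact_no>
   </client>
</Company>};

#Hypothetical helper: returns one formatted line per client, sorted by id
sub client_rows {
    my ($source) = @_;
    my $xml = XMLin($source, ForceArray => 1);
    my @rows;
    foreach my $id (sort { $a <=> $b } keys %{ $xml->{client} }) {
        my $client = $xml->{client}{$id};
        push @rows, sprintf("%-6s %-16s %-26s %s",
            $id, $client->{cname}[0], $client->{address}[0],
            $client->{contact_no}[0]);
    }
    return @rows;
}

#Print the title and then each client row
printf "%-6s %-16s %-26s %s\n", 'ID', 'Name', 'Address', 'Phone';
print "$_\n" for client_rows($xmldata);
```

Sorting the keys makes the output repeatable between runs, and the fixed column widths keep the table aligned when the names and addresses have different lengths.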

Example 3: Parse the XML File Using XML::Smart

XML::Smart is another useful Perl module for reading XML data from a variable or a file. Run the following command to install this module on the Ubuntu operating system before using it in a Perl script:

$ sudo apt install libxml-smart-perl

 

Create a Perl file with the following script that reads the “clients.xml” file and prints the information of the first “client” node of the XML file:

#!/usr/bin/env perl
use strict;
use warnings;
use XML::Smart;
use 5.34.0;

#Define the XML filename
my $xmlfile = 'clients.xml';
#Create object to parse the XML data
my $xmlObj = XML::Smart->new($xmlfile);

#Parse the first client's information
say "Client Name: $xmlObj->{Company}{client}{cname}";
say "Client Address: $xmlObj->{Company}{client}{address}";
say "Client Contact No.: $xmlObj->{Company}{client}{contact_no}";

 

Output:

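The following output appears after executing the script. According to the output, the name, address, and contact number of the first client are printed. The script does not use any index for the “client” node, so only the first of the three “client” nodes is read.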
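Example 3 reads only the first “client” node. The following is a minimal sketch of how all “client” nodes might be read with XML::Smart, based on the array-dereference usage shown in the module's documentation. It assumes that XML::Smart->new() receives the XML data as a string (two records copied from “clients.xml”) instead of a filename. The client_summaries() helper name is hypothetical.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use XML::Smart;

#Inline copy of two records from clients.xml
my $xmldata =
q{<Company>
   <client id="6745">
        <cname>Md. Abdullah</cname>
        <address>78, Mirpur, Dhaka.</address>
        <contact_no>+8801954563423</contact_no>
   </client>
   <client id="8956">
        <cname>Nila Hasan</cname>
        <address>124/A, Dhanmondi, Dhaka</address>
        <contact_no>+8801833654234</contact_no>
   </client>
</Company>};

#Hypothetical helper: returns one summary string per client node
sub client_summaries {
    my ($source) = @_;
    my $xml = XML::Smart->new($source);
    my @summaries;
    #Dereferencing the node as an array gives every node named "client"
    foreach my $client (@{ $xml->{Company}{client} }) {
        push @summaries,
            "$client->{id}: $client->{cname}, $client->{contact_no}";
    }
    return @summaries;
}

#Print the summary of each client
print "$_\n" for client_summaries($xmldata);
```

Here, the “id” attribute is read in the same way as a child node, and each value is converted to a string when it is placed inside the double-quoted string.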

Conclusion

Perl has many modules that parse XML data in different ways. This tutorial showed the uses of the XML::Simple and XML::Smart modules, first to parse XML data stored in a variable and then to parse an XML file.

About the author

Fahmida Yesmin

I am a trainer of web programming courses. I like to write articles and tutorials on various IT topics. I have a YouTube channel, Tutorials4u Help, where many types of tutorials based on Ubuntu, Windows, Word, Excel, WordPress, Magento, Laravel, etc. are published.