Working with XML

We will see how to read the XML nodes,its sub nodes and the data.
Let’s start with a practical example. We will take the below XML as input.

<?xml version="1.0" encoding="UTF-8"?>
<WorkerDetails>
<Worker>
<Summary>
	<Employee_ID>123456</Employee_ID>
	<Name>Shekhar Sharma (123456)</Name>
</Summary>
	<Eligibility MyValue="">true</Eligibility>
<Personal>
<Name_Data>
	<Name_Type MyValue="">Legal</Name_Type>
	<First_Name MyValue="">Shekhar1</First_Name>
	<Last_Name MyValue="">Sharma1</Last_Name>
</Name_Data>
<Name_Data>
	<Name_Type MyValue="">Preferred</Name_Type>
	<First_Name MyValue="">Shekhar2</First_Name>
	<Last_Name MyValue="">Sharma2</Last_Name>
</Name_Data>
	<Gender MyValue="">M</Gender>
	<Birth_Date MyValue="">1975-11-30</Birth_Date>
	<Country_of_Birth MyValue="">IND</Country_of_Birth>
	<City_of_Birth MyValue="">Delhi</City_of_Birth>
	<Marital_Status MyValue="">Unmarried</Marital_Status>
	<Nationality MyValue="">IND</Nationality>
</Personal>
<Address_Data>
	<Address_Type MyValue="">Temp</Address_Type>
	<Is_Primary MyValue="">true</Is_Primary>
	<Municipality MyValue="">Bangalore</Municipality>
	<Region MyValue="">Bellandur</Region>
	<Postal_Code MyValue="">560037</Postal_Code>
	<Country MyValue="">IND</Country>
</Address_Data>
</Worker>
</WorkerDetails>

How to Load the XML file?

We need to create the object for XML parsing.

'argXMLFilePath   -  Path of the xml file
argXMLFilePath = "D:\Demo.xml
Set oXMLFile = CreateObject("Msxml2.DOMDocument")
oXMLFile.Load(argXMLFilePath)
oXMLFile.async = "False"

1. We are creating the object for XML using “Msxml2.DOMDocument”.
2. We are loading it through load method.
3. Async has some specific significance here.

If async is true, then method returns the control to the caller before the download is complete.
If async is flase, then download must be completed before the caller gets the control back.


How to count the particular nodes?

Count Node
Specific Node count

Specific Node count


Get the first name text:

Get First Name Text
argXMLFilePath = “D:\Demo.xml”
Set oXMLFile = CreateObject(“Msxml2.DOMDocument”)
oXMLFile.Load(argXMLFilePath)
oXMLFile.async “False”
Set nodes=oXMLFile.SelectNodes(“/WorkerDetails/Worker/Personal/Name_Data
/First_Name”
)
‘ fetch the text
For i = 0 To (nodes.Length – 1)
FirstName = nodes.item(i).text
MsgBox “FirstName :” & FirstName
Next
Names


One of my friends asked for a help on one of his requirement and the requirement was like fetching all nodes, child of that nodes and their data into one excel file. Functions given below are generic. If you have same kind of requirement,then you just need to provide the required file paths and few arguments, Rest it will take care. Otherwise, you can have some idea from this to fulfil your specific requirement. I am sharing that code, so it will helpful for needed.I will tell you in the end,How to call these functions.

You can use the same XML and the functions to see how it works. You will require 3 functions as given below.

First Function:

'‘***********************************************************************************************
'‘Function Name:             fn_ExtractXMLNodes_ToExcel
'Descripton :                it will extract all the XMLnodes to Excel file for a particular Employee ID
'List of  Variables:            -
'Input Parameters:           argXMLFilePath :  XML file path 
							'argParentName : Parent Name from where to start loading e.g. Worker in our Demo example
							'argExcelPath :Excel workboox path  like "D:\Demo.xls" 
							'argSheetName : Sheet name where to store the nodes  like "Sheet1"
							'argEmployeeID : Emp ID  for which data needs to be loaded  
							'strFlag - For future use
'Return Value:                   - Null
'Author:                                Shekhar Sharma
  
'‘***********************************************************************************************
Function fn_ExtractXMLNodes_ToExcel(argXMLFilePath,argParentName,argExcelPath,argSheetName,argFirstNodeData,strFlag)

		rv = "true"
		getChildNode = "Summary"  ' You can pick up this data through data table or some environment variable
		getSubchild = "Employee_ID"  ' You can pick up this data through data table or some environment variable
		Set oXMLFile = CreateObject("Msxml2.DOMDocument") 
		oXMLFile.Load(argXMLFilePath)
		oXMLFile.async = "False"
		ReadCounter = 2
	
		Set objParent = oXMLFile.documentElement.childNodes
	

			For i = 0 to objParent.length-1					'Loop through the whole document
				strParentName =  objParent.item(i).nodeName
				If instr(strParentName,argParentName) Then
					Set objChild = objParent.item(i).childNodes
						For Icnt = 0 to objChild.length-1			'' Loop through the Childs of expected parent
							strChildName =  objChild.item(Icnt).nodeName

							'This is for Worker Employee
							If instr(strChildName,getChildNode) Then
									Set objSubChild = objChild.item(Icnt).childNodes
									For Jcnt  = 0 to objSubChild.length-1		'' Loop through the subChilds of expected child
											strEmployeeID =  objSubChild.item(Jcnt).nodeName
											If instr(strEmployeeID,getSubchild) Then
													strEmpID =  objSubChild.item(Jcnt).text
													If instr(strEmpID,argFirstNodeData) Then
														Set objExpectedChild = objParent.item(i).childNodes		''Pass this object into the functio below
														rv = "False"
														Exit for
													End If
											End If
									Next
							End If
							
									If rv = "False" Then
										Exit For
									End If
						Next
				End If
									If rv = "False" Then
										Exit For
									End If
			Next

		Call  fn_ExtractXMLchildNodes_ToExcel(strExcelPath,strSheetName,objExpectedChild,"")	
			Set oXMLFile = Nothing
End Function

Second Function:

'‘***********************************************************************************************
'‘Function Name:             fn_ExtractXMLchildNodes_ToExcel
'Descripton :                    '' it will extract all the XML  childnodes  to Excel file for a particular Employee ID
'List of  Variables:            -
'Input Parameters:          
							'strExcelPath :Excel path 
							'strSheetName : Sheet name where to store the nodes
							'objExpectedChild : Emp ID  for which data needs to be loaded
							'strFlag - For future use
'Return Value:                   - Null
'Author:                                Shekhar Sharma
 
'‘***********************************************************************************************

	Function fn_ExtractXMLchildNodes_ToExcel(strExcelPath,strSheetName,objExpectedChild,strFlag)
		Set oExcel = CreateObject("Excel.Application")
		oExcel.Visible = "True"
		Set oWB = oExcel.Workbooks.Open(strExcelPath)
		Set oSheet = oWB.Worksheets(strSheetName)
		ParentCounter = 2
	
		For Kcnt = 0 to objExpectedChild.length-1
					oSheet.cells(ParentCounter,1) =  objExpectedChild.item(Kcnt).nodeName  '' Summary node
					If objExpectedChild.item(Kcnt).hasChildNodes Then   			'' child node as Emp ID  and name 
						flagSetcount = True
						Set oSubChild = objExpectedChild.item(Kcnt).childNodes			
						For Lcnt = 0 to oSubChild.length -1
							If oSubChild.item(Lcnt).hasChildNodes Then
								Set oSubSuperChild = oSubChild.item(Lcnt).childNodes
									If oSubSuperChild.length > 1 Then
										For Ncnt = 0 to oSubSuperChild.length-1
											oSheet.cells(ParentCounter,2) = oSubChild.item(Lcnt).nodeName
											Set objSameChildCount =objExpectedChild.item(Kcnt).selectNodes(oSubChild.item(Lcnt).nodeName)
											If objSameChildCount.length >1 and flagSetcount = True Then
												oSheet.cells(ParentCounter,3) = objSameChildCount.length
												flagSetcount = False
											End If
											oSheet.cells(ParentCounter,4) = oSubSuperChild.item(Ncnt).nodeName
											ParentCounter = ParentCounter + 1
										Next
									Else
										oSheet.cells(ParentCounter,2) = oSubChild.item(Lcnt).nodeName
										ParentCounter = ParentCounter + 1
									End If
							Else
									ParentCounter = ParentCounter + 1
							End If
						Next
	
					End If
			Next
	oWB.Save
	oWB.Close
	oExcel.Quit
	End Function

Third Function:

																																																
'‘***********************************************************************************************
'‘Function Name:             fn_Verify_XML_Data_FromExcelSheet
'Descripton :                    ''For reading the data from XML to excel
'List of  Variables:            -
'Input Parameters:        '  argXMLFilePath - XMLfile path 
'                         ' argEmpID -Uniqe ID(Emp ID etc)
												
                          'strExcelPath- Excel Workbook Path
			 'strSheetName :sheet name of the expected data
			  'strFlag - For future use
			' argParentName - Parent node from where we have picked up the data
'Return Value:           Null
'Author:                 Shekhar Sharma
 
''‘************************************************************************************************
Function fn_Verify_XML_Data_FromExcelSheet(argXMLFilePath,argEmpID,strExcelPath,strSheetName,strFlag,argParentName)
	 
XMLFileName=argXMLFilePath
Set oXMLFile = CreateObject("Msxml2.DOMDocument") 
oXMLFile.Load(XMLFileName)
oXMLFile.async = "False"
' Start reading row in Excel Sheet from 2nd position
ReadCounter = 2		

Set objXpath =  oXMLFile.documentElement.selectNodes ("//"&argParentName)

'Remove the unwanted Child and create the object for expected one
For Icnt = 0 to objXpath.length- 1
	intEmpID =  objXpath.item(Icnt).text
	If Instr(intEmpID,argEmpID) Then
	 Set objExpectedParent = objXpath.item(Icnt)
	Else
	 oXMLFile.documentElement.removeChild(objXpath.item(Icnt))
	End If
 Next
If not  isobject(objExpectedParent)  Then
	msgbox "Your reporting function if expected parent not found"
End If
Set oExcel = CreateObject("Excel.Application")
oExcel.Visible = "True"
Set  oWB = oExcel.Workbooks.Open(strExcelPath)
Set oSheet = oWB.Worksheets(strSheetName)
	For Jcnt = 2 to oSheet.usedRange.Rows.count
		strParentElement = oSheet.cells(Jcnt,1)
		If strParentElement <> "" Then
		PreserveParentVal = strParentElement
		strChildElement = oSheet.cells(Jcnt,2)
		intChildCount  = oSheet.cells(Jcnt,3)
		strSubChild  = oSheet.cells(Jcnt,4)
	If intChildCount <> "" and cint(intChildCount) > 1 and strChildElement <> "" Then
		Set objDataXpath =  objExpectedParent.ownerDocument.selectNodes("//"&strParentElement&"/"&strChildElement)
		If Not(objDataXpath is nothing) Then
			If  objDataXpath.length = cint(intChildCount) Then
					For Kcnt = 0 to objDataXpath.length-1
							For Ecnt = 0 to objDataXpath.item(Kcnt).childNodes.length - 1
								Set objSuperChild = objDataXpath.item(Kcnt).childNodes
								strSubChild  = oSheet.cells(Jcnt,4)
								Set objSubChild = objSuperChild.item(Ecnt)
								If Not(objSubChild is nothing) Then
									strActualData = objSubChild.text
								    oSheet.cells(Jcnt,6) = "'"&strActualData			' Save a text in excel
									Jcnt = Jcnt+1
								Else
									 strChildElement = oSheet.cells(Jcnt,2)
										 strParentElement  = oSheet.cells(Jcnt,1)
										strSubChild  = oSheet.cells(Jcnt,4)
									' Fail if Child Node not Present in XML
									Jcnt = Jcnt+1
								End If
							Next
					Next
					 Jcnt = Jcnt-1
				Else
					strObjDescription = strParentElement&"-"&strChildElement&"-"&strSubChild
					' Fail if Child Node not Present in XML
					Jcnt = Jcnt+1
		End If
			Else
			Exit Function
		End If
	ElseIf strChildElement <> "" and strSubChild <> "" Then 
				Set objSubChild = objExpectedParent.ownerDocument.selectSingleNode("//"&strParentElement&"/"&strChildElement&"/"&strSubChild)
				If Not(objSubChild is nothing) Then
						strActualData = objSubChild.text
						oSheet.cells(Jcnt,6) =  "'"&strActualData
				Else
					strObjDescription = strParentElement&"-"&strChildElement&"-"&strSubChild
					' Fail if Child Node not Present in XML
					Jcnt = Jcnt+1
				End If
	ElseIf strChildElement <> "" and strSubChild =  "" Then 		'' If only child item is presetn , not the subchild
				Set objSubChild = objExpectedParent.ownerDocument.selectSingleNode("//"&strParentElement&"/"&strChildElement)
				If Not(objSubChild is nothing) Then
						strActualData = objSubChild.text
						oSheet.cells(Jcnt,6) =  "'"&strActualData
				Else
					' Fail if Child Node not Present in XML
					Jcnt = Jcnt+1
				End If
	End If
	Else
	strChildElement = oSheet.cells(Jcnt,2)				''' This else part is used if the parent is empty. It will use the preserved parent value which was preserved previously
	intChildCount  = oSheet.cells(Jcnt,3)
	strSubChild  = oSheet.cells(Jcnt,4)
	If intChildCount <> "" and cint(intChildCount) > 1 and strChildElement <> "" Then
		Set objDataXpath =  objExpectedParent.ownerDocument.selectNodes("//"&PreserveParentVal&"/"&strChildElement)
		If Not(objDataXpath is nothing) Then
			If  objDataXpath.length = cint(intChildCount) Then
				For Kcnt = 0 to objDataXpath.length-1
					For Ecnt = 0 to objDataXpath.item(Kcnt).childNodes.length - 1
						Set objSuperChild = objDataXpath.item(Kcnt).childNodes
						strSubChild  = oSheet.cells(Jcnt,4)
						Set objSubChild = objSuperChild.item(Ecnt)
						If Not(objSubChild is nothing) Then
							strActualData = objSubChild.text
							oSheet.cells(Jcnt,6) =  "'"&strActualData
							Jcnt = Jcnt+1
						Else
							strObjDescription = strParentElement&"-"&strChildElement
							' Fail if Child Node not Present in XML - your reportig function
							Jcnt = Jcnt+1
						End If
					Next
			Next
			 Jcnt = Jcnt-1
			Else
			' Fail if Child Node not Present in XML - your reportig function
			Jcnt = Jcnt+1
		End If
			Else
			Exit Function
		End If
	ElseIf strChildElement <> "" and strSubChild <> "" Then 
		Set objSubChild = objExpectedParent.ownerDocument.selectSingleNode("//"&PreserveParentVal&"/"&strChildElement&"/"&strSubChild)
		If Not(objSubChild is nothing) Then
				strActualData = objSubChild.text
				strExpectedData = oSheet.cells(Jcnt,5)		' To get the data column value
			Else
			' Fail if Child Node not Present in XML- your reportig function
			Jcnt = Jcnt+1
		End If
	ElseIf strChildElement <> "" and strSubChild =  "" Then 		'' If only child item is presetn , not the subchild
		Set objSubChild = objExpectedParent.ownerDocument.selectSingleNode("//"&PreserveParentVal&"/"&strChildElement)
		If Not(objSubChild is nothing) Then
			strActualData = objSubChild.text
			oSheet.cells(Jcnt,6) =  "'"&strActualData
		Else
			' Fail if Child Node not Present in XML- Your reporting function 
			Jcnt = Jcnt+1
		End If
	End If
	End If
	Next
	oWB.Save
	oWB.Close
	oExcel.Quit
End Function

You can call these functions as shown below.

argXMLFilePath = "D:\Demo.xml"
argParentName = "Worker"
strExcelPath = "D:\WorkdayIntegration\NGA\Demo.xls"
strSheetName = "Sheet1"
argEmpID = "123456"
Call  fn_ExtractXMLNodes_ToExcel(argXMLFilePath,argParentName,strExcelPath,strSheetName,argEmpID,"")		
Call  fn_Verify_XML_Data_FromExcelSheet(argXMLFilePath,argEmpID,strExcelPath,strSheetName,"",argParentName)

It will write the data from second row in the excel sheet. The columns in excel are explained below
Column 1: Parent node
Column 2: Child Node
Column 3: no. of time the subchilds are repeated for its parent
Column 4: Subchild, if any
Column 6: Text data of nodes

You can provide column header as per your need. The excel will look like this as given below

ExcelData

 

Avatar photo

Shekhar Sharma

Shekhar Sharma is founder of testingpool.com. This website is his window to the world. He believes that ,"Knowledge increases by sharing but not by saving".

You may also like...