Working with XML
We will see how to read the XML nodes,its sub nodes and the data.
Let’s start with a practical example. We will take the below XML as input.
<?xml version="1.0" encoding="UTF-8"?> <WorkerDetails> <Worker> <Summary> <Employee_ID>123456</Employee_ID> <Name>Shekhar Sharma (123456)</Name> </Summary> <Eligibility MyValue="">true</Eligibility> <Personal> <Name_Data> <Name_Type MyValue="">Legal</Name_Type> <First_Name MyValue="">Shekhar1</First_Name> <Last_Name MyValue="">Sharma1</Last_Name> </Name_Data> <Name_Data> <Name_Type MyValue="">Preferred</Name_Type> <First_Name MyValue="">Shekhar2</First_Name> <Last_Name MyValue="">Sharma2</Last_Name> </Name_Data> <Gender MyValue="">M</Gender> <Birth_Date MyValue="">1975-11-30</Birth_Date> <Country_of_Birth MyValue="">IND</Country_of_Birth> <City_of_Birth MyValue="">Delhi</City_of_Birth> <Marital_Status MyValue="">Unmarried</Marital_Status> <Nationality MyValue="">IND</Nationality> </Personal> <Address_Data> <Address_Type MyValue="">Temp</Address_Type> <Is_Primary MyValue="">true</Is_Primary> <Municipality MyValue="">Bangalore</Municipality> <Region MyValue="">Bellandur</Region> <Postal_Code MyValue="">560037</Postal_Code> <Country MyValue="">IND</Country> </Address_Data> </Worker> </WorkerDetails>
How to Load the XML file?
We need to create the object for XML parsing.
'argXMLFilePath - Path of the xml file argXMLFilePath = "D:\Demo.xml Set oXMLFile = CreateObject("Msxml2.DOMDocument") oXMLFile.Load(argXMLFilePath) oXMLFile.async = "False"
1. We are creating the object for XML using “Msxml2.DOMDocument”.
2. We are loading it through load method.
3. Async has some specific significance here.
If async is true, then method returns the control to the caller before the download is complete.
If async is flase, then download must be completed before the caller gets the control back.
How to count the particular nodes?
Get the first name text:
Set oXMLFile = CreateObject(“Msxml2.DOMDocument”)
oXMLFile.Load(argXMLFilePath)
oXMLFile.async = “False”
Set nodes=oXMLFile.SelectNodes(“/WorkerDetails/Worker/Personal/Name_Data
/First_Name”)
‘ fetch the text
For i = 0 To (nodes.Length – 1)
FirstName = nodes.item(i).text
MsgBox “FirstName :” & FirstName
Next
One of my friends asked for a help on one of his requirement and the requirement was like fetching all nodes, child of that nodes and their data into one excel file. Functions given below are generic. If you have same kind of requirement,then you just need to provide the required file paths and few arguments, Rest it will take care. Otherwise, you can have some idea from this to fulfil your specific requirement. I am sharing that code, so it will helpful for needed.I will tell you in the end,How to call these functions.
You can use the same XML and the functions to see how it works. You will require 3 functions as given below.
First Function:
'‘*********************************************************************************************** '‘Function Name: fn_ExtractXMLNodes_ToExcel 'Descripton : it will extract all the XMLnodes to Excel file for a particular Employee ID 'List of Variables: - 'Input Parameters: argXMLFilePath : XML file path 'argParentName : Parent Name from where to start loading e.g. Worker in our Demo example 'argExcelPath :Excel workboox path like "D:\Demo.xls" 'argSheetName : Sheet name where to store the nodes like "Sheet1" 'argEmployeeID : Emp ID for which data needs to be loaded 'strFlag - For future use 'Return Value: - Null 'Author: Shekhar Sharma '‘*********************************************************************************************** Function fn_ExtractXMLNodes_ToExcel(argXMLFilePath,argParentName,argExcelPath,argSheetName,argFirstNodeData,strFlag) rv = "true" getChildNode = "Summary" ' You can pick up this data through data table or some environment variable getSubchild = "Employee_ID" ' You can pick up this data through data table or some environment variable Set oXMLFile = CreateObject("Msxml2.DOMDocument") oXMLFile.Load(argXMLFilePath) oXMLFile.async = "False" ReadCounter = 2 Set objParent = oXMLFile.documentElement.childNodes For i = 0 to objParent.length-1 'Loop through the whole document strParentName = objParent.item(i).nodeName If instr(strParentName,argParentName) Then Set objChild = objParent.item(i).childNodes For Icnt = 0 to objChild.length-1 '' Loop through the Childs of expected parent strChildName = objChild.item(Icnt).nodeName 'This is for Worker Employee If instr(strChildName,getChildNode) Then Set objSubChild = objChild.item(Icnt).childNodes For Jcnt = 0 to objSubChild.length-1 '' Loop through the subChilds of expected child strEmployeeID = objSubChild.item(Jcnt).nodeName If instr(strEmployeeID,getSubchild) Then strEmpID = objSubChild.item(Jcnt).text If instr(strEmpID,argFirstNodeData) Then Set objExpectedChild = objParent.item(i).childNodes ''Pass this object into the functio below rv = "False" Exit for End If End If Next End If If rv = "False" Then Exit For End If Next End If If rv = "False" Then Exit For End If Next Call fn_ExtractXMLchildNodes_ToExcel(strExcelPath,strSheetName,objExpectedChild,"") Set oXMLFile = Nothing End Function
Second Function:
'‘*********************************************************************************************** '‘Function Name: fn_ExtractXMLchildNodes_ToExcel 'Descripton : '' it will extract all the XML childnodes to Excel file for a particular Employee ID 'List of Variables: - 'Input Parameters: 'strExcelPath :Excel path 'strSheetName : Sheet name where to store the nodes 'objExpectedChild : Emp ID for which data needs to be loaded 'strFlag - For future use 'Return Value: - Null 'Author: Shekhar Sharma '‘*********************************************************************************************** Function fn_ExtractXMLchildNodes_ToExcel(strExcelPath,strSheetName,objExpectedChild,strFlag) Set oExcel = CreateObject("Excel.Application") oExcel.Visible = "True" Set oWB = oExcel.Workbooks.Open(strExcelPath) Set oSheet = oWB.Worksheets(strSheetName) ParentCounter = 2 For Kcnt = 0 to objExpectedChild.length-1 oSheet.cells(ParentCounter,1) = objExpectedChild.item(Kcnt).nodeName '' Summary node If objExpectedChild.item(Kcnt).hasChildNodes Then '' child node as Emp ID and name flagSetcount = True Set oSubChild = objExpectedChild.item(Kcnt).childNodes For Lcnt = 0 to oSubChild.length -1 If oSubChild.item(Lcnt).hasChildNodes Then Set oSubSuperChild = oSubChild.item(Lcnt).childNodes If oSubSuperChild.length > 1 Then For Ncnt = 0 to oSubSuperChild.length-1 oSheet.cells(ParentCounter,2) = oSubChild.item(Lcnt).nodeName Set objSameChildCount =objExpectedChild.item(Kcnt).selectNodes(oSubChild.item(Lcnt).nodeName) If objSameChildCount.length >1 and flagSetcount = True Then oSheet.cells(ParentCounter,3) = objSameChildCount.length flagSetcount = False End If oSheet.cells(ParentCounter,4) = oSubSuperChild.item(Ncnt).nodeName ParentCounter = ParentCounter + 1 Next Else oSheet.cells(ParentCounter,2) = oSubChild.item(Lcnt).nodeName ParentCounter = ParentCounter + 1 End If Else ParentCounter = ParentCounter + 1 End If Next End If Next oWB.Save oWB.Close oExcel.Quit End Function
Third Function:
'‘*********************************************************************************************** '‘Function Name: fn_Verify_XML_Data_FromExcelSheet 'Descripton : ''For reading the data from XML to excel 'List of Variables: - 'Input Parameters: ' argXMLFilePath - XMLfile path ' ' argEmpID -Uniqe ID(Emp ID etc) 'strExcelPath- Excel Workbook Path 'strSheetName :sheet name of the expected data 'strFlag - For future use ' argParentName - Parent node from where we have picked up the data 'Return Value: Null 'Author: Shekhar Sharma ''‘************************************************************************************************ Function fn_Verify_XML_Data_FromExcelSheet(argXMLFilePath,argEmpID,strExcelPath,strSheetName,strFlag,argParentName) XMLFileName=argXMLFilePath Set oXMLFile = CreateObject("Msxml2.DOMDocument") oXMLFile.Load(XMLFileName) oXMLFile.async = "False" ' Start reading row in Excel Sheet from 2nd position ReadCounter = 2 Set objXpath = oXMLFile.documentElement.selectNodes ("//"&argParentName) 'Remove the unwanted Child and create the object for expected one For Icnt = 0 to objXpath.length- 1 intEmpID = objXpath.item(Icnt).text If Instr(intEmpID,argEmpID) Then Set objExpectedParent = objXpath.item(Icnt) Else oXMLFile.documentElement.removeChild(objXpath.item(Icnt)) End If Next If not isobject(objExpectedParent) Then msgbox "Your reporting function if expected parent not found" End If Set oExcel = CreateObject("Excel.Application") oExcel.Visible = "True" Set oWB = oExcel.Workbooks.Open(strExcelPath) Set oSheet = oWB.Worksheets(strSheetName) For Jcnt = 2 to oSheet.usedRange.Rows.count strParentElement = oSheet.cells(Jcnt,1) If strParentElement <> "" Then PreserveParentVal = strParentElement strChildElement = oSheet.cells(Jcnt,2) intChildCount = oSheet.cells(Jcnt,3) strSubChild = oSheet.cells(Jcnt,4) If intChildCount <> "" and cint(intChildCount) > 1 and strChildElement <> "" Then Set objDataXpath = objExpectedParent.ownerDocument.selectNodes("//"&strParentElement&"/"&strChildElement) If Not(objDataXpath is nothing) Then If objDataXpath.length = cint(intChildCount) Then For Kcnt = 0 to objDataXpath.length-1 For Ecnt = 0 to objDataXpath.item(Kcnt).childNodes.length - 1 Set objSuperChild = objDataXpath.item(Kcnt).childNodes strSubChild = oSheet.cells(Jcnt,4) Set objSubChild = objSuperChild.item(Ecnt) If Not(objSubChild is nothing) Then strActualData = objSubChild.text oSheet.cells(Jcnt,6) = "'"&strActualData ' Save a text in excel Jcnt = Jcnt+1 Else strChildElement = oSheet.cells(Jcnt,2) strParentElement = oSheet.cells(Jcnt,1) strSubChild = oSheet.cells(Jcnt,4) ' Fail if Child Node not Present in XML Jcnt = Jcnt+1 End If Next Next Jcnt = Jcnt-1 Else strObjDescription = strParentElement&"-"&strChildElement&"-"&strSubChild ' Fail if Child Node not Present in XML Jcnt = Jcnt+1 End If Else Exit Function End If ElseIf strChildElement <> "" and strSubChild <> "" Then Set objSubChild = objExpectedParent.ownerDocument.selectSingleNode("//"&strParentElement&"/"&strChildElement&"/"&strSubChild) If Not(objSubChild is nothing) Then strActualData = objSubChild.text oSheet.cells(Jcnt,6) = "'"&strActualData Else strObjDescription = strParentElement&"-"&strChildElement&"-"&strSubChild ' Fail if Child Node not Present in XML Jcnt = Jcnt+1 End If ElseIf strChildElement <> "" and strSubChild = "" Then '' If only child item is presetn , not the subchild Set objSubChild = objExpectedParent.ownerDocument.selectSingleNode("//"&strParentElement&"/"&strChildElement) If Not(objSubChild is nothing) Then strActualData = objSubChild.text oSheet.cells(Jcnt,6) = "'"&strActualData Else ' Fail if Child Node not Present in XML Jcnt = Jcnt+1 End If End If Else strChildElement = oSheet.cells(Jcnt,2) ''' This else part is used if the parent is empty. It will use the preserved parent value which was preserved previously intChildCount = oSheet.cells(Jcnt,3) strSubChild = oSheet.cells(Jcnt,4) If intChildCount <> "" and cint(intChildCount) > 1 and strChildElement <> "" Then Set objDataXpath = objExpectedParent.ownerDocument.selectNodes("//"&PreserveParentVal&"/"&strChildElement) If Not(objDataXpath is nothing) Then If objDataXpath.length = cint(intChildCount) Then For Kcnt = 0 to objDataXpath.length-1 For Ecnt = 0 to objDataXpath.item(Kcnt).childNodes.length - 1 Set objSuperChild = objDataXpath.item(Kcnt).childNodes strSubChild = oSheet.cells(Jcnt,4) Set objSubChild = objSuperChild.item(Ecnt) If Not(objSubChild is nothing) Then strActualData = objSubChild.text oSheet.cells(Jcnt,6) = "'"&strActualData Jcnt = Jcnt+1 Else strObjDescription = strParentElement&"-"&strChildElement ' Fail if Child Node not Present in XML - your reportig function Jcnt = Jcnt+1 End If Next Next Jcnt = Jcnt-1 Else ' Fail if Child Node not Present in XML - your reportig function Jcnt = Jcnt+1 End If Else Exit Function End If ElseIf strChildElement <> "" and strSubChild <> "" Then Set objSubChild = objExpectedParent.ownerDocument.selectSingleNode("//"&PreserveParentVal&"/"&strChildElement&"/"&strSubChild) If Not(objSubChild is nothing) Then strActualData = objSubChild.text strExpectedData = oSheet.cells(Jcnt,5) ' To get the data column value Else ' Fail if Child Node not Present in XML- your reportig function Jcnt = Jcnt+1 End If ElseIf strChildElement <> "" and strSubChild = "" Then '' If only child item is presetn , not the subchild Set objSubChild = objExpectedParent.ownerDocument.selectSingleNode("//"&PreserveParentVal&"/"&strChildElement) If Not(objSubChild is nothing) Then strActualData = objSubChild.text oSheet.cells(Jcnt,6) = "'"&strActualData Else ' Fail if Child Node not Present in XML- Your reporting function Jcnt = Jcnt+1 End If End If End If Next oWB.Save oWB.Close oExcel.Quit End Function
You can call these functions as shown below.
argXMLFilePath = "D:\Demo.xml" argParentName = "Worker" strExcelPath = "D:\WorkdayIntegration\NGA\Demo.xls" strSheetName = "Sheet1" argEmpID = "123456" Call fn_ExtractXMLNodes_ToExcel(argXMLFilePath,argParentName,strExcelPath,strSheetName,argEmpID,"") Call fn_Verify_XML_Data_FromExcelSheet(argXMLFilePath,argEmpID,strExcelPath,strSheetName,"",argParentName)
It will write the data from second row in the excel sheet. The columns in excel are explained below
Column 1: Parent node
Column 2: Child Node
Column 3: no. of time the subchilds are repeated for its parent
Column 4: Subchild, if any
Column 6: Text data of nodes
You can provide column header as per your need. The excel will look like this as given below