The definition file job.xml tends to grow as you scrape more and more data sets or pages. This chapter explains how to split the definitions into multiple files for easy maintenance.

Gotz allows you to split the definitions either on job types or on model types.


Split on Job Types

The examples so far scraped the BS, PL and Quote data (price and snapshot) of Acme. We can split these definitions into three job types: quote, bs and pl. Example-13 splits job.xml into quote.xml, bs.xml and pl.xml. The bean.xml merges these files to create the effective job definition. The bean.xml is as below.

<gotz xmlns="http://codetab.org/gotz">
    <bean name="task quote" xmlFile="quote.xml" />
    <bean name="task bs" xmlFile="bs.xml" />
    <bean name="task pl" xmlFile="pl.xml" />
</gotz>
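
As a sketch of how this layout scales, suppose a fourth job type were added later. Assuming Gotz treats every bean entry the same way, bean.xml would need only one more line. The job type cf and the file cf.xml are hypothetical and not part of Example-13; the bean name simply follows the pattern of the existing entries.

```xml
<gotz xmlns="http://codetab.org/gotz">
    <bean name="task quote" xmlFile="quote.xml" />
    <bean name="task bs" xmlFile="bs.xml" />
    <bean name="task pl" xmlFile="pl.xml" />
    <!-- hypothetical: cash flow definitions kept in their own file -->
    <bean name="task cf" xmlFile="cf.xml" />
</gotz>
```

The existing definition files stay untouched, which is the maintenance benefit of splitting on job types.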

When we split on job types, the broad outline of each definition file is as below.

<gotz>

    <locators />               <!-- one or more -->

    <fields>
        <steps />            
        <tasks>                <!-- one or more -->
            <task>             <!-- one or more -->     
               <steps>
                  <step />     <!-- optional local steps -->     
               </steps>
            </task>
        </tasks>
    </fields>

    <dataDef />                <!-- one or more -->

</gotz>

Split on Model

Another way to split is based on Gotz model objects. Locators, DataDef and Fields are the model objects. Example-14 shows how to split on them. The bean.xml is as follows.

<gotz xmlns="http://codetab.org/gotz">
    <bean name="locator" xmlFile="locator.xml" />
    <bean name="task" xmlFile="task.xml" />
    <bean name="datadef" xmlFile="datadef.xml" />
</gotz>

The outline of each file is as below.


locator.xml

<gotz>

    <locators />               <!-- one or more -->

</gotz>

task.xml

<gotz>

    <fields>
        <steps />            
        <tasks>                <!-- one or more -->
            <task>             <!-- one or more -->     
               <steps>
                  <step />     <!-- optional local steps -->     
               </steps>
            </task>
        </tasks>
    </fields>

</gotz>

datadef.xml

<gotz>

    <dataDef />                <!-- one or more -->

</gotz>

In the next chapter we explain the logging framework of Gotz.