dcf.tasks
Class HTMLCrawlerTask

dcf.tasks.HTMLCrawlerTask
All Implemented Interfaces:
Task

public class HTMLCrawlerTask
implements Task

Implements a distributed web crawler. The task data is a Vector of URLs to visit. The task is circular: when it completes, new tasks are created from the links found by the solver.
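The circular behaviour described above can be illustrated with a minimal sketch. This is not the DCF source: CrawlTaskSketch is a hypothetical stand-in class, its String-typed signatures are simplified for readability, and skipping URLs the task already covered is an assumption of the sketch rather than documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.Vector;

// Hypothetical stand-in for a circular crawl task; not the real dcf.tasks.HTMLCrawlerTask.
class CrawlTaskSketch {
    private final Vector<String> urls;                             // task data: URLs to visit
    private final Set<String> foundLinks = new LinkedHashSet<>();  // links reported back

    CrawlTaskSketch(List<String> urls) {
        this.urls = new Vector<>(urls);
    }

    Vector<String> getData() {
        return urls;
    }

    // Called once per link discovered while visiting the URLs; duplicates are ignored.
    void addResult(String link) {
        foundLinks.add(link);
    }

    // A circular task produces follow-up work when it is returned.
    boolean isCircular() {
        return true;
    }

    // Builds follow-up tasks from the links found.
    // Assumption of this sketch: URLs this task already covered are skipped.
    Vector<CrawlTaskSketch> generate() {
        List<String> next = new ArrayList<>();
        for (String link : foundLinks) {
            if (!urls.contains(link)) {
                next.add(link);
            }
        }
        Vector<CrawlTaskSketch> tasks = new Vector<>();
        if (!next.isEmpty()) {
            tasks.add(new CrawlTaskSketch(next));
        }
        return tasks;
    }

    public static void main(String[] args) {
        CrawlTaskSketch task = new CrawlTaskSketch(Arrays.asList("http://example.com/"));
        task.addResult("http://example.com/about");
        task.addResult("http://example.com/"); // already covered, so it is skipped
        for (CrawlTaskSketch next : task.generate()) {
            System.out.println(next.getData()); // prints [http://example.com/about]
        }
    }
}
```

Using a set for the found links keeps one follow-up entry per distinct URL, which matters for a crawler because pages frequently link to each other.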

Author:
Tal Salmona

Constructor Summary
HTMLCrawlerTask()
           
 
Method Summary
 void addMessage(dcf.tasks.String str)
          Posts a message to the parent
 void addResult(java.lang.Object obj)
           
 void divide()
          Divide the task into parts - if the data is a List it is already divided.
 java.util.Vector generate()
          Generates a new HTMLCrawlerTask using the new links we've found
 java.lang.Object getData()
           
 dcf.tasks.String getName()
           
 java.lang.Object getResults()
           
 int getSize()
           
 boolean hasMore()
           
 boolean isCircular()
          The Distributer will call this method when a Task is returned by the client.
If the task is circular, the client will call the generate() method to create additional tasks.
 java.lang.Object next()
           
 void processResults()
          This is where we save whatever we want to save on the server side.
 void setData(java.lang.Object obj)
           
 void setName(dcf.tasks.String n)
           
 void setParent(Worker w)
          Sets this task's parent worker
 void setProgress(float i)
          Sets the task progress (used by the Solver to report progress to the Worker)
 
Methods inherited from interface dcf.server.Task
setName
 

Constructor Detail

HTMLCrawlerTask

public HTMLCrawlerTask()
Method Detail

setData

public void setData(java.lang.Object obj)
Specified by:
setData in interface Task
Following copied from interface: dcf.server.Task
Parameters:
obj - Most commonly a Vector of data to be processed.

getData

public java.lang.Object getData()
Specified by:
getData in interface Task

divide

public void divide()
Description copied from interface: Task
Divide the task into parts - if the data is a List it is already divided.
Specified by:
divide in interface Task
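One way to read the divide() contract together with hasMore(), next() and getSize() is sketched here. DivideSketch is a hypothetical class, not the DCF implementation, and treating non-List data as a single part is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Vector;

// Hypothetical sketch of the divide()/hasMore()/next() contract; not the DCF source.
class DivideSketch {
    private Object data;
    private List<Object> parts = new ArrayList<>();
    private Iterator<Object> cursor = parts.iterator();

    void setData(Object obj) {
        this.data = obj;
    }

    // If the data is a List (a Vector qualifies), it is already divided:
    // each element is one part. Otherwise, as an assumption of this sketch,
    // the whole object becomes a single part.
    void divide() {
        parts = new ArrayList<>();
        if (data instanceof List) {
            parts.addAll((List<?>) data);
        } else if (data != null) {
            parts.add(data);
        }
        cursor = parts.iterator();
    }

    int getSize() {
        return parts.size();
    }

    boolean hasMore() {
        return cursor.hasNext();
    }

    Object next() {
        return cursor.next();
    }

    public static void main(String[] args) {
        DivideSketch task = new DivideSketch();
        task.setData(new Vector<String>(Arrays.asList("http://example.com/a", "http://example.com/b")));
        task.divide();
        while (task.hasMore()) {
            System.out.println(task.next()); // one URL per part
        }
    }
}
```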

hasMore

public boolean hasMore()
Specified by:
hasMore in interface Task

next

public java.lang.Object next()
Specified by:
next in interface Task

getSize

public int getSize()
Specified by:
getSize in interface Task

addResult

public void addResult(java.lang.Object obj)
Specified by:
addResult in interface Task
Following copied from interface: dcf.server.Task
Parameters:
obj - An Object to add to the results.

getResults

public java.lang.Object getResults()
Specified by:
getResults in interface Task

setName

public void setName(dcf.tasks.String n)

getName

public dcf.tasks.String getName()
Specified by:
getName in interface Task

isCircular

public boolean isCircular()
Description copied from interface: Task
The Distributer will call this method when a Task is returned by the client.
If the task is circular, the client will call the generate() method to create additional tasks.
Specified by:
isCircular in interface Task
Following copied from interface: dcf.server.Task
Returns:
true - If the task should create new tasks. false - If the task ends when returned.
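The check described above might look roughly like the following when a task is returned. All names here (CircularTaskSketch, DepthLimitedTask, ReturnHandlerSketch) are hypothetical; the depth limit exists only so the example terminates, and the sketch does not settle which component actually calls generate() in DCF.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Vector;

// Hypothetical, trimmed-down view of a task; the real dcf.server.Task has more methods.
interface CircularTaskSketch {
    boolean isCircular();
    Vector<CircularTaskSketch> generate();
}

// Hypothetical task that stays circular until its depth budget runs out.
class DepthLimitedTask implements CircularTaskSketch {
    private final int depth;

    DepthLimitedTask(int depth) {
        this.depth = depth;
    }

    public boolean isCircular() {
        return depth > 0;
    }

    public Vector<CircularTaskSketch> generate() {
        Vector<CircularTaskSketch> tasks = new Vector<>();
        tasks.add(new DepthLimitedTask(depth - 1));
        return tasks;
    }
}

// Hypothetical handler for returned tasks.
class ReturnHandlerSketch {
    private final Deque<CircularTaskSketch> pending = new ArrayDeque<>();

    // Circular tasks enqueue their follow-up tasks; non-circular tasks simply end.
    // Returns the number of tasks that were added to the queue.
    int onTaskReturned(CircularTaskSketch task) {
        if (!task.isCircular()) {
            return 0;
        }
        Vector<CircularTaskSketch> more = task.generate();
        pending.addAll(more);
        return more.size();
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        ReturnHandlerSketch handler = new ReturnHandlerSketch();
        System.out.println(handler.onTaskReturned(new DepthLimitedTask(1))); // prints 1
        System.out.println(handler.onTaskReturned(new DepthLimitedTask(0))); // prints 0
    }
}
```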

generate

public java.util.Vector generate()
Generates a new HTMLCrawlerTask using the new links we've found
Specified by:
generate in interface Task
Returns:
A Vector of new tasks.

processResults

public void processResults()
This is where we save whatever we want to save on the server side.
Specified by:
processResults in interface Task

setParent

public void setParent(Worker w)
Sets this task's parent worker
Specified by:
setParent in interface Task
Parameters:
w - The parent worker.

addMessage

public void addMessage(dcf.tasks.String str)
Posts a message to the parent
Parameters:
str - The message to post.

setProgress

public void setProgress(float i)
Sets the task progress (used by the Solver to report progress to the Worker)
Parameters:
i - The progress for this task.




Distributed Computation Framework