Fetching title from multiple webpages using VBSCRIPT

Posted: Wed Mar 06, 2013 12:06 am
by Suresh23
Hi folks,

I am writing a VBScript that takes multiple links, loads each one in the browser, and then fetches the title of the web page. So far my code opens the given links in the browser, but I need help fetching the title of each page. Here is the script:

/////

Const navOpenInNewTab = &h0800
Dim url(1), i
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
url(0) = "http://www.sapien.com/software/primalscript"
'title - PrimalScript 2012
url(1) = "http://www.sapien.com/software/powershell_studio"
'title - PowerShell Studio 2012
For i = 0 To 1
    ' Open each link in a new tab and wait until the browser is no longer busy
    ie.Navigate url(i), CLng(navOpenInNewTab)
    While ie.Busy
        WScript.Sleep 50
    Wend
Next

//////

Can anyone help me add to the script so that I can fetch the titles of the web pages and save them to a file along with the links? Thanks!

Re: Fetching title from multiple webpages using VBSCRIPT

Posted: Wed Mar 06, 2013 2:20 am
by jvierra
This is very complex to do with VBScript. It is much easier if you use the XMLHTTP client object and even easier in PowerShell. It still takes a very good and complete knowledge of HTML page structures to parse a page.

Search the web for "web scraping" and you should find examples of code for doing this.
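
For reference, the XMLHTTP route mentioned above might look roughly like the sketch below. It is only a minimal illustration: the helper names FetchHtml and TitleFromHtml are made up for this example, the "MSXML2.XMLHTTP" ProgID is assumed to be registered on the machine, and a regular expression is a shortcut rather than a real HTML parser.

```vbscript
Option Explicit

' Hypothetical helper: downloads a page with a synchronous GET.
' Returns the response body, or "" when the status is not 200.
Function FetchHtml(address)
    Dim http
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", address, False
    http.Send
    FetchHtml = ""
    If http.Status = 200 Then FetchHtml = http.responseText
End Function

' Hypothetical helper: returns the text of the first <title> element,
' or "" when the page has none (the element is optional).
Function TitleFromHtml(html)
    Dim re, found
    Set re = New RegExp
    re.Pattern = "<title[^>]*>([\s\S]*?)</title>"  ' non-greedy, may span lines
    re.IgnoreCase = True
    re.Global = False
    TitleFromHtml = ""
    Set found = re.Execute(html)
    If found.Count > 0 Then
        TitleFromHtml = Trim(found(0).SubMatches(0))  ' strips surrounding spaces
    End If
End Function

Dim pageUrl
For Each pageUrl In Array("http://www.sapien.com/software/primalscript", _
                          "http://www.sapien.com/software/powershell_studio")
    WScript.Echo pageUrl & vbTab & TitleFromHtml(FetchHtml(pageUrl))
Next
```

Run it with cscript.exe so each line goes to the console instead of a message box. Pages that build their title with client-side script will not show it this way, since nothing is rendered.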

Re: Fetching title from multiple webpages using VBSCRIPT

Posted: Wed Mar 06, 2013 10:50 pm
by Suresh23
Thanks for your feedback, jvierra. I appreciate it. I believe this method

Set t = IE.document.getElementsByTagName("h1")

might be helpful for scraping the data. But I'm just a beginner, so I'm thinking of scripting around it to make it work :)

Re: Fetching title from multiple webpages using VBSCRIPT

Posted: Wed Mar 06, 2013 11:24 pm
by jvierra
The page title is in <title> and it is optional. <h1> can exist many times and can also be missing.

IE exposes a window title.

This is why I posted that you need to know html and page design in order to screen scrape effectively.

Your request is far too general. Create a script and ask a specific question. There are many web resources for learning how to write and use HTML.
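
To make the <title> versus <h1> point concrete, here is a rough sketch of a fallback chain: use the document title if there is one, otherwise the first <h1>, otherwise whatever name IE itself reports for the location. The function name PageTitle and the fallback order are assumptions made for illustration, not an established convention.

```vbscript
Option Explicit

' Hypothetical helper: picks a display title for the page loaded in ie.
Function PageTitle(ie)
    Dim doc, headings, result
    Set doc = ie.Document
    result = Trim(doc.title)                 ' text of <title>, "" if missing
    If result = "" Then
        Set headings = doc.getElementsByTagName("h1")
        If headings.length > 0 Then result = Trim(headings.item(0).innerText)
    End If
    If result = "" Then result = ie.LocationName  ' name reported by the browser
    PageTitle = result
End Function

Dim browser
Set browser = CreateObject("InternetExplorer.Application")
browser.Navigate2 "http://www.sapien.com/software/primalscript"
Do
    WScript.Sleep 100
Loop While browser.Busy Or browser.ReadyState <> 4   ' 4 = READYSTATE_COMPLETE
WScript.Echo PageTitle(browser)
browser.Quit
```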

Re: Fetching title from multiple webpages using VBSCRIPT

Posted: Sun Oct 06, 2013 4:42 pm
by hackoo
Hi ;)
Try this code:
VBScript Code
Option Explicit
Dim URL,fso,ws,LogFile
Set fso = CreateObject("Scripting.FileSystemObject")
Set ws = CreateObject("Wscript.Shell")
LogFile = Left(Wscript.ScriptFullName,InstrRev(Wscript.ScriptFullName, ".")) & "txt"
if fso.FileExists(LogFile) Then 
	fso.DeleteFile LogFile
end If
URL = InputBox("Enter the URL of the page whose title you want to extract:",_
"Extract Title from web page","http://www.sapien.com/software/powershell_studio")

Call ExtractTitleFrom(URL)

Sub ExtractTitleFrom(URL)
Dim Title,ie,Ws,Question,Data,objRegex,Match,Matches,i
    Title = "Extract Title from "& DblQuote(URL)
    Set ie = CreateObject("InternetExplorer.Application")
    Set Ws = CreateObject("wscript.Shell")
    ie.Navigate(URL)
    ie.Visible=false
    ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
    Do While ie.Busy Or ie.ReadyState <> 4
        WScript.Sleep 100
    Loop
    Data = ie.document.documentElement.innerHTML
    ie.Quit
    Set ie = Nothing
    WriteLog String(35,"*") & Now & String(35,"*") & vbCrLf &_
    RegExp("<title>(.*)</title>",Data) & vbCrLf & String(89,"*"),LogFile
    Question = MsgBox(RegExp("<title>(.*)</title>",Data)& vbcr & vbcr &_
    "Do you want to access this site : "& URL &" ?",VBYesNO+VbQuestion,Title)
    If Question = VbYes then
        Ws.Run URL,1,False
        ws.Run LogFile,1,False
    Else
        ws.Run LogFile,1,False
        Wscript.Quit
    end if
End Sub

Function RegExp(Pattern,Data) 
Dim objRegex,Matches,Match,i
 Set objRegex = new RegExp
 objRegex.Pattern = Pattern  
 objRegex.Global = False 'first match only
 objRegex.IgnoreCase = True 'ignore case
 Set Matches = objRegex.Execute(Data)  
 If Matches.Count > 0 Then
      Set Match = Matches(0)
   If Match.SubMatches.Count > 0 Then
        For i = 0 To Match.SubMatches.Count-1
            RegExp = Match.SubMatches(i)
        Next
    End If
 End If
 Set Matches = Nothing  
 Set objRegex = Nothing  
End Function

Function DblQuote(Str)
	DblQuote = Chr(34) & Str & Chr(34)
End Function

Sub WriteLog(strText,LogFile)
	Dim fs,ts 
	Const ForAppending = 8
	Set fs = CreateObject("Scripting.FileSystemObject")
	Set ts = fs.OpenTextFile(LogFile,ForAppending,True)
	ts.WriteLine strText
	ts.Close
End Sub

Re: Fetching title from multiple webpages using VBSCRIPT

Posted: Sun Oct 06, 2013 5:33 pm
by jvierra
To get the title and links from a page:
PowerShell Code
$ie = new-object -ComObject InternetExplorer.Application
$ie.Navigate2('http://www.google.com')
while($ie.Busy){ Start-Sleep -Milliseconds 100 }
$ie.Document.title
$ie.Document.links|select href
It works the same in VBScript.

Re: Fetching title from multiple webpages using VBSCRIPT

Posted: Sun Oct 06, 2013 6:19 pm
by jvierra
I know - you don't believe me. Well - here it is.
VBScript Code
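Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate2 "http://www.google.com"
' Wait for the page to load without spinning the CPU
While ie.Busy
    WScript.Sleep 100
Wend
WScript.Echo ie.Document.title
For Each link In ie.Document.links
    WScript.Echo link.href
Next
ie.Quit ' close the hidden IE instance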
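
Applied to the original question (several links, with each title saved to a file next to its link), the same idea might be sketched as follows. The output file name titles.txt and the tab-separated layout are assumptions for the example. The two URLs are the ones from the first post.

```vbscript
Option Explicit

Const READYSTATE_COMPLETE = 4
Const ForWriting = 2
Const TristateTrue = -1   ' write the file as Unicode so any title can be stored

Dim urls, pageUrl, ie, fso, out
urls = Array("http://www.sapien.com/software/primalscript", _
             "http://www.sapien.com/software/powershell_studio")

Set fso = CreateObject("Scripting.FileSystemObject")
' titles.txt is a hypothetical name; it is created in the current directory
Set out = fso.OpenTextFile("titles.txt", ForWriting, True, TristateTrue)

Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False

For Each pageUrl In urls
    ie.Navigate2 pageUrl
    ' Sleep first, then poll, so the previous page's state is not mistaken for "done"
    Do
        WScript.Sleep 100
    Loop While ie.Busy Or ie.ReadyState <> READYSTATE_COMPLETE
    out.WriteLine pageUrl & vbTab & ie.Document.title
Next

out.Close
ie.Quit
```

One browser instance is reused for every link instead of opening a new tab per page, which keeps ie.Document pointing at the page that was just loaded.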

Re: Fetching title from multiple webpages using VBSCRIPT

Posted: Sun Oct 06, 2013 6:27 pm
by jvierra
Note that my original answer to this was way off for some reason. My brain must have been out of gear when I answered the OP's question.