How to read all text of a web page and check the spelling


Deepa

Sep 14, 2010, 8:35:38 AM
to Selenium Users
Hi Friends,
I want to read all the text on the page at a given URL and check
whether it is spelled correctly. The check must also report the words
that are misspelled.
Is there any way to do this?

Help from anybody would be greatly appreciated.
Please.


Thanks & Regards
Deepa

Niraj Kumar

Sep 14, 2010, 1:49:06 PM
to seleniu...@googlegroups.com

Try it like this:

String str = "http://test.com/test1/test2/test3/test4";

// Split the URL on "/". Index 1 is an empty string because of the "//" after "http:".
String[] splitResult = str.split("/");

System.out.println(" The first element is " + splitResult[0]);
System.out.println(" The second element is " + splitResult[1]);
System.out.println(" The third element is " + splitResult[2]);
System.out.println(" The fourth element is " + splitResult[3]);
System.out.println(" The fifth element is " + splitResult[4]);

Hope this helps; if not, let me know.

--
You received this message because you are subscribed to the Google Groups "Selenium Users" group.
To post to this group, send email to seleniu...@googlegroups.com.
To unsubscribe from this group, send email to selenium-user...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/selenium-users?hl=en.




--
Thanks & Regards,
Niraj Kumar

Deepa Kiran Bagepalli

Sep 15, 2010, 12:39:22 AM
to seleniu...@googlegroups.com
Thanks for your reply, Niraj.
What I am looking for is a way to read the content of the page, i.e. when the given URL is opened, I want to read all the text on that web page.
Please suggest a way of doing this.

Thanks & Regards
Deepa

Aniket Deshpande

Sep 15, 2010, 12:57:43 AM
to seleniu...@googlegroups.com
Hmmm... Interesting...
A couple of solutions spring to mind:
1) You could use the Google Language API Spell Checker. The problem here is that you would have to send a few words at a time to the API, so I don't know how feasible it would be.
2) You could use the Microsoft Office libraries and use the MS Word library to do your spell check. (You would have to use Selenium with C# for this.)
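
To illustrate the "few words at a time" point in option 1, here is a minimal Java sketch of splitting a page's words into small batches, one batch per request. The class name WordBatcher, the method batches, and the batch size of 3 are made up for illustration; no actual spell-check API call or real request limit is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class WordBatcher {
    /** Splits the words into consecutive batches of at most batchSize words each. */
    static List<List<String>> batches(List<String> words, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> result = new ArrayList<>();
        for (int start = 0; start < words.size(); start += batchSize) {
            int end = Math.min(start + batchSize, words.size());
            // Copy the sublist so each batch is independent of the source list.
            result.add(new ArrayList<>(words.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("read", "all", "text", "of", "a", "web", "page");
        // A batch size of 3 is an arbitrary illustrative value, not a real API limit.
        for (List<String> batch : batches(words, 3)) {
            // Each batch would be the payload of one spell-check request.
            System.out.println(batch);
        }
    }
}
```

Whether this is feasible depends on how many words the page has and how many requests the chosen service allows, which is the concern raised in option 1.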

Lemme know what you think

- ANIKET

Deepa Kiran Bagepalli

Sep 15, 2010, 1:04:26 AM
to seleniu...@googlegroups.com
Aniket, thanks a ton for your reply.
My first question is: how can I get all the text of the web page?
Secondly, regarding your suggestion:
--> Does Selenium support MS Word if I use C#? What extra facilities does C# provide? (Please let me know, as I have no idea about using C# with Selenium. I currently use Java.)

Thanks & Regards
Deepa 

Aniket Deshpande

Sep 15, 2010, 1:18:45 AM
to seleniu...@googlegroups.com
Answer to the first question: selenium.GetText("<top div id>"). You can use the id of your top div, or you could use the XPath '//body' to get all the text on the web page.
Answer to the second question: You can use the MS Office Interop assemblies for Word. You can download them here:
http://support.microsoft.com/kb/328912
You can then reference the Word assembly DLL in your C# project and use it.

Lemme know if you have more questions...

- ANIKET

Bindu Laxminarayan

Sep 17, 2010, 4:38:50 PM
to Selenium Users
As Aniket mentioned, you can get the text of the web page with the getText
function.

You can use selenium.getText("//html");
This will return the text of the whole web page. Then you can use a spell
checker, if you know of one.
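
One rough way to do the spell-check step in plain Java, assuming the page text has already been fetched with getText: split the text into words and report every word that is missing from a dictionary. The class name SpellCheckSketch, the method findMisspelled, the sample text, and the tiny dictionary are all invented for illustration; a real check would load a full word list or call a proper spell-check library.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

final class SpellCheckSketch {
    /**
     * Returns the distinct words of pageText (lower-cased, in order of first
     * appearance) that are not contained in the dictionary.
     */
    static Set<String> findMisspelled(String pageText, Set<String> dictionary) {
        Set<String> misspelled = new LinkedHashSet<>();
        // Split on every run of non-letter characters. This is a simplification:
        // contractions such as "don't" are split into "don" and "t".
        for (String token : pageText.split("[^\\p{L}]+")) {
            String word = token.toLowerCase(Locale.ROOT);
            if (!word.isEmpty() && !dictionary.contains(word)) {
                misspelled.add(word);
            }
        }
        return misspelled;
    }

    public static void main(String[] args) {
        // Stand-in for the string returned by selenium.getText("//html").
        String pageText = "Welcome to our web page. Plese read the the text carefuly!";
        // Tiny hypothetical dictionary; a real one would be loaded from a word list.
        Set<String> dictionary = new HashSet<>(Arrays.asList(
                "welcome", "to", "our", "web", "page", "please", "read", "the", "text", "carefully"));
        System.out.println(findMisspelled(pageText, dictionary)); // prints [plese, carefuly]
    }
}
```

A dictionary lookup like this only flags unknown words; it does not catch repeated words or suggest corrections.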

Thanks
Bindu Laxminarayan
