Group info
Language: English
Group categories:
Computers > Software
Discussions
Topics 1 - 10 of 982

Description: User group for Cascading users
 

FTP Tap 
  Does anyone know if there is an FTP tap? Or has anyone tried to create one? If not, I'm interested in the possibility of creating one. Best, Fede -- Federico Brubacher @fbru02
By Federico Brubacher  - Jun 4 - 2 new of 2 messages    

Sink to database in 2.0-WIP 
  Hi, I'm trying to write Tap and Scheme classes for database output in 2.0-WIP, following the approach used in the Hadoop example 'DbCountPageView': org.apache.hadoop.mapred.lib.db.DBConfiguration.configureDB(jobConf, driverClassName, connectionUrl); org.apache.hadoop.mapred.lib.db.DBOutputFormat.setOutput(jobConf,...
By Ivan Zelensky  - Jun 1 - 2 new of 2 messages    

AggregateBy and sort support 
  Hi Chris, While looking into implementing a FirstBy, I was expecting that I could specify sorting fields in the AggregateBy constructor. This would then let me efficiently use First as my aggregator. But AggregateBy currently doesn't let you specify sorting. In the 1.2.5 code it looks like this would be a trivial change, as it would just get used in the initialize() method when setting up the GroupBy:...
By Ken Krugler  - Jun 1 - 7 new of 7 messages    

Line numbers in Hadoop 
  Hi, We're using Cascading to validate files submitted by users. We want to report errors with line numbers to the users. So if they wrote a string where an int is expected, we'd like to say "Line 45: field X should be an int". I understand that Hadoop cannot provide this information since it...
By Philippe Laflamme  - May 31 - 1 new of 1 message    

Set-Similarity Join on Multiple Attributes 
  I saw a previous post [link] regarding set-similarity joins for deduplicating data. How would I implement this fuzzy join on multiple attributes? I am trying to implement an entity resolution solution using Hadoop. I could, for instance, use the...
By SSS  - May 29 - 2 new of 2 messages    

Crawling using the Scalding framework 
  Hi, I am looking to crawl a site. A page can contain: (1) information to extract, and (2) more URLs to crawl (pagination). I tried to follow the Bixo tutorial [link], but am wondering if there is anything similar in Scalding (by...
By Jyotirmoy Sundi  - May 25 - 4 new of 4 messages    

Does the cascading-jdbc module do connection pooling 
  Hi, I need to load aggregated data created by a Cascading flow into a DB. The file will be a few hundred MB. I was thinking of using cascading-jdbc, but I am worried that it could amount to a DDoS attack on my DB because of the many connections opened by the Cascading program. Does cascading-jdbc have something like a connection pool so that it...
By Amit Anand  - May 24 - 1 new of 1 message    

Can Pig scripts be run as part of a Cascading flow 
  Can I execute a Pig script from my Cascading flow? I need to join data from two files in different formats on a common key, which I believe will be difficult to do in Cascading. So I was thinking of using a Pig script to do this as part of my Cascading flow. Please suggest...
By Amit Anand  - May 24 - 3 new of 3 messages    

Urgent Client Position ::: Senior Websphere Systems Administrator - Work Location : Eldorado Hill , CA 
  Would someone please ban this spammer? Thanks. ... ... -- Dean Wampler "Functional Programming for Java Developers", "Programming Scala", and "Programming Hive" (forthcoming) - O'Reilly twitter: @deanwampler, @chicagoscala [link]
By Dean Wampler  - May 23 - 2 new of 2 messages    

GroupBy(NONE,FIRST) when input is UNKNOWN 
  Should I be able to do this? I want to bring everything onto one reducer (group by Fields.NONE), and then sort by whatever is in the first field (Fields.FIRST). This works if the incoming fields are known, but fails if they are unknown. cascading.tuple.TupleException: given tuple not same size as position...
By jd  - May 23 - 4 new of 4 messages    


Send email to this group: cascading-user@googlegroups.com