[GSoC2012] Content Personalisation and Targeting module

Yuki Awano

Apr 1, 2012, 1:04:06 PM
to silverst...@googlegroups.com
Hi, I'm Yuki Awano, a computer science undergraduate student at Kyoto University.
I have some experience with PHP, CSS, and HTML.
While I have created websites with WordPress and Concrete 5, to be honest, I'm really new to SilverStripe.
I got to know SilverStripe through the Google Summer of Code project list.

When I saw the idea list, I got really interested in the idea: Content Personalization and Targeting Module.

That is because I think the major CMSs do not support content personalization.
Currently, content personalization is a difficult task for those who are not good at programming.

In designing a content personalization module, I think these points are important.

1. Visitors should be able to control their information.
For privacy reasons, visitors should be able to delete their information. And the website owner should be able to easily create the control page for visitors.

2. Owners can use the module easily.
This module should be easy to use for those who are not good at programming or web technologies.
Users should be able to make rules without writing code.

In implementing this module, I think the big problem is performance.
To personalize the content, I think we need to store the access logs of every user.
Especially on large sites, the data would become huge, and the transactions would be too heavy for the SQL database.

I know there are key-value datastores that perform better than SQL databases, such as MongoDB.
However, MongoDB is currently not part of SilverStripe's required environment.
If this module required MongoDB, users would have to install the database.
This would make installing SilverStripe more difficult.

In this module, the datastore layer should be implemented as an abstraction layer.
Users can select which database to use: MongoDB (NoSQL) or the default SQL database.
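
A minimal sketch of what such a storage abstraction could look like, written in Python only for illustration (a SilverStripe module would be PHP). The names `VisitStore`, `InMemoryVisitStore`, `record`, `history`, and `forget` are hypothetical and not part of any SilverStripe API. The in-memory class stands in for a real SQL or MongoDB backend, which would implement the same three methods.

```python
from abc import ABC, abstractmethod
from collections import defaultdict
from typing import Dict, List


class VisitStore(ABC):
    """Hypothetical storage interface for visitor access logs."""

    @abstractmethod
    def record(self, visitor_id: str, url: str) -> None:
        """Append one page view to the visitor's log."""

    @abstractmethod
    def history(self, visitor_id: str) -> List[str]:
        """Return the URLs the visitor has viewed, oldest first."""

    @abstractmethod
    def forget(self, visitor_id: str) -> None:
        """Delete everything stored about the visitor."""


class InMemoryVisitStore(VisitStore):
    """Stand-in backend; an SQL or MongoDB backend would expose the same methods."""

    def __init__(self) -> None:
        self._visits: Dict[str, List[str]] = defaultdict(list)

    def record(self, visitor_id: str, url: str) -> None:
        self._visits[visitor_id].append(url)

    def history(self, visitor_id: str) -> List[str]:
        # Return a copy so callers cannot modify the stored log.
        return list(self._visits.get(visitor_id, []))

    def forget(self, visitor_id: str) -> None:
        self._visits.pop(visitor_id, None)


# The rest of the module would depend only on the VisitStore interface.
store: VisitStore = InMemoryVisitStore()
store.record("visitor-1", "/products")
store.record("visitor-1", "/pricing")
```

The `forget` method also covers point 1 above: a visitor-facing control page could call it to delete that visitor's stored information.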

I think this module can change the way content personalization is done with a CMS.
I am really interested in this project.

I hope to receive some opinions and suggestions from everyone.

Regards,
Yuki

xeraa

Apr 1, 2012, 4:59:30 PM
to silverst...@googlegroups.com
Hi Yuki,

While I can't say much about personalization / targeting (but it's definitely a great project, so please do apply), I can add some thoughts on the persistence stuff:

> In implementing this module, I think the big problem is performance.
> To personalize the content, I think we need to store the access logs of every user.
> Especially on large sites, the data would become huge, and the transactions would be too heavy for the SQL database.
>
> I know there are key-value datastores that perform better than SQL databases, such as MongoDB.

MongoDB is not a key-value datastore; it's generally counted among the document stores...
In general, with NoSQL you will first need to establish what your data looks like and how you want to interact with it (e.g. what kind of queries you need). Only then can you evaluate which NoSQL solution will fit your requirements best, as the different implementations are much more diverse than in the RDBMS arena.
 
> However, MongoDB is currently not part of SilverStripe's required environment.
> If this module required MongoDB, users would have to install the database.
> This would make installing SilverStripe more difficult.
>
> In this module, the datastore layer should be implemented as an abstraction layer.
> Users can select which database to use: MongoDB (NoSQL) or the default SQL database.

The current ORM only supports RDBMS (MySQL, PostgreSQL, MS SQL - not sure about the current status of SQLite and Oracle).
Within your project you won't have time to extend the ORM to support NoSQL solutions - you'll need to work on personalization / targeting. We did think about offering a project dedicated to adding MongoDB support, but it was deemed to be too complex and too tightly coupled with core, so we dropped the idea...

While performance is important, IMHO the main focus of your project is to create a solid basis for personalization and targeting. You won't be able to solve all problems in this area within three months.

Cheers,
Philipp

Yuki Awano

Apr 2, 2012, 9:44:35 AM
to silverst...@googlegroups.com
Hi Philipp,

Thank you for your suggestions, and sorry for the late reply.

If I get accepted to GSoC, I will focus on the basis for personalization and targeting.
Performance is future work.

I have a question about the project idea page.

Why are the rules of audience types written in "code"?
I think the rules could be created with an interface like the one for creating a smart playlist in iTunes.
That would be much easier for users who are not good at programming.
Is it because there is not much time to implement the UI?
Or is it because code can represent more complex rules?

I would be happy to hear any opinions or suggestions about this.

Thanks,
Yuki

xeraa

Apr 2, 2012, 5:39:07 PM
to silverst...@googlegroups.com
Hi Yuki,
 
> Why are the rules of audience types written in "code"?
> I think the rules could be created with an interface like the one for creating a smart playlist in iTunes.
> That would be much easier for users who are not good at programming.
> Is it because there is not much time to implement the UI?
> Or is it because code can represent more complex rules?

I can't really say anything about the specific case (maybe Sig can help us here), but this is probably based on SilverStripe's philosophy in general:
We try to keep as much as possible "in code" as it is easy to version (in Git or SVN), update between local/dev/production, and share between sites. Additionally, seasoned developers feel more comfortable and are probably quicker when working with code rather than having to click through a lot of interfaces.

Personal experience: Drupal is pretty GUI heavy for configurations. IMHO it's a nightmare to keep your configuration the same between local, dev, and production - especially when a project is already online and you want to extend it. You can't simply copy DB dumps around as you'd overwrite your live data and merging DBs is something I don't want to do.

Cheers,
Philipp

Yuki Awano

Apr 3, 2012, 12:24:38 AM
to silverst...@googlegroups.com
Hi Philipp,

Thank you for your reply.

I see the advantages of code.
I have experience with WordPress of copying an environment from development to production.
It was a nightmare.
Copying plugin settings in particular took a lot of hard work.

I am going to apply for this project.
Thanks so much for your suggestions and opinions.

Regards,
Yuki

Sigurd Magnusson

Apr 5, 2012, 6:17:26 PM
to silverst...@googlegroups.com
Yuki, thanks for signing up. As this is a new module and to some extent, an experiment, the idea is to quickly get something up and running that can be tested on real sites, and extended and refined based on actual use and results. So, performance should be given some thought, but comes second in importance.

> Why are the rules of audience types written in "code"? I think the rules could be created with an interface like the one for creating a smart playlist in iTunes. That would be much easier for users who are not good at programming.
> Is it because there is not much time to implement the UI?
> Or is it because code can represent more complex rules?

You've pretty well answered the question here :) I do expect some GUI options, and while a GUI for fully defining rules would be nice, that would take too much time and it's not clear exactly what the rules interface would need to do yet. As we have nothing yet, we need to create the underlying features first. Code also provides more flexibility and scope for experimentation. 

Sigurd (who raised this project idea initially)

Yuki Awano

Apr 6, 2012, 1:44:43 AM
to silverst...@googlegroups.com
Hi Sigurd, thank you for your reply.
I am happy to hear comments from the person who initially raised this project idea.

As I mentioned before, I am really interested in this project,
because I think this module would become the first module to provide content personalization and targeting features for users, and content personalization is useful for visitors.

Following the discussions above, I decided to focus on the basis of the content personalization and targeting module in GSoC.

As I wrote in my GSoC proposal, I will implement a DSL (domain-specific language) for defining audience types.
I think this approach could make it easier to implement the basis and also to build GUIs on top of it in future work.
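
To make the idea concrete, here is a rough sketch of how audience types might be declared as plain data and evaluated against what is known about a visitor. It is in Python purely for illustration and is not the DSL from the proposal; the operator names, field names (`country`, `search_query`), and audience type names are all made up.

```python
import operator

# Hypothetical operator vocabulary for the rule language.
OPERATORS = {
    "is": operator.eq,
    "is_not": operator.ne,
    "contains": lambda value, expected: expected in value,
}

# Each audience type is a list of (field, operator, expected) rules.
# All rules must hold for a visitor to belong to the type.
AUDIENCE_TYPES = {
    "local_visitor": [("country", "is", "NZ")],
    "cms_researcher": [
        ("search_query", "contains", "cms"),
        ("country", "is_not", "NZ"),
    ],
}


def matches(rules, visitor):
    """Return True when every rule holds; a missing field never matches."""
    return all(
        field in visitor and OPERATORS[op](visitor[field], expected)
        for field, op, expected in rules
    )


def audience_types_for(visitor):
    """Return the names of all audience types the visitor belongs to."""
    return [name for name, rules in AUDIENCE_TYPES.items() if matches(rules, visitor)]
```

Because the rules are data rather than arbitrary code, they can still be versioned in Git, and a GUI (for example an iTunes-style rule builder) could later read and write the same structure.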

In addition, since we already know the problems this module will face, such as performance, I can implement it with those problems in mind.
For example, I can implement abstraction layers so that caching and a NoSQL database can be added in future work.

I think the rules should be loosely coupled to the core of content personalization.
And this module should be easily extensible and flexible.
So that the module can be evaluated early, I am planning to finish implementing the DSL and the core of content personalization before the midterm evaluation.

Thanks,
Yuki

Ingo Schommer

Apr 19, 2012, 6:04:21 PM
to silverst...@googlegroups.com
Hello Yuki,

I'm just trying to get my head around the idea, and my first instinct is always to look for
similar work that has been done for other CMS or frameworks :)

I've found the following interesting posts and solutions. Maybe there are some interesting starting points for you in there?

Are you aware of any libraries or external services we can reuse for this?

Does this idea overlap with Mark's "dynamic templates" module? http://www.silverstripe.org/dynamic-templates-module/
It's mainly for a/b variation testing, with template devs uploading new templates - could be extended to support more criteria than simple randomized distribution?
Although it's not 100% "CMS author friendly" due to using SS's template syntax.

We only have limited support for configurable "content pieces" in the CMS, which a non-technical user can add to his page (through the "widget" API).
Most of the customization happens on a page level, which is too broad for content personalization. How are you planning to work around this?
We *could* mandate usage of widgets (which are then made to conditionally show), and I've seen a few successful implementations of this.
But it would limit focus to a very specific (and under-represented) way of managing content in SilverStripe.

Thanks
Ingo

xeraa

Apr 19, 2012, 11:55:50 PM
to silverst...@googlegroups.com
I'll try to add a few ideas (it's already really late / early so I hope this makes sense to others / to me in the morning ;-)):

1) Could this be handled in a similar fashion as translations? Instead of a language you could select or be auto-selected for a context? However, this might be a total overkill, both in terms of scope for the project and work for site maintainers.

2) I think the indicators for the selection of a context are so far dynamic (OS, location, search query,...). Should you also be able to select the environment in a static way (part of the registration / user profile) or is this beyond the scope of this project?

3) Are there already any specific plans for the selection of the environment? The more I think about it, the more I like how the desktop application ControlPlane (http://www.controlplaneapp.com) handles this:
* You define the available contexts (for example local visitor, remote visitor, visitor looking for X,...)
* You define the evidence sources you want to use (in our case that would be the location, a search string that has been used,...)
* Define the rules (if the country is NZ it's 80% a local visitor, if the country is Europe it's 95% a remote visitor, if the user has searched for X he should be in the X context by 90%,...) - the highest percentage wins the context.
* Based on the context, you can define different actions (what content to display).
This is much too complex for us, but maybe we can scale that down in one way or another?
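
A scaled-down version of that scheme might look roughly like the following sketch. It is Python for illustration only and does not claim to reproduce how ControlPlane combines evidence; the rule table simply restates the examples above, with a made-up `region` evidence source standing in for "the country is Europe". When several rules match, the context with the highest confidence wins.

```python
# (evidence source, expected value, context, confidence) -- illustrative values only.
RULES = [
    ("country", "NZ", "local visitor", 0.80),
    ("region", "Europe", "remote visitor", 0.95),
    ("search", "X", "visitor looking for X", 0.90),
]


def pick_context(evidence, rules=RULES, default="default"):
    """Return the context whose best matching rule has the highest confidence."""
    scores = {}
    for source, expected, context, confidence in rules:
        if evidence.get(source) == expected:
            # Keep the strongest rule seen so far for each context.
            scores[context] = max(scores.get(context, 0.0), confidence)
    if not scores:
        return default
    return max(scores, key=scores.get)


# A visitor from NZ who searched for X lands in the "X" context (0.90 beats 0.80).
context = pick_context({"country": "NZ", "search": "X"})
```

The action step (what content to display) would then just be a lookup keyed on the returned context.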


Cheers,
Philipp

Yuki Awano

Apr 20, 2012, 10:32:17 AM
to silverst...@googlegroups.com
Hello Ingo and Philipp,

Thank you for your opinions, and sorry for my late reply.
Currently I don't know of any library that we can reuse for this module.

When I checked the plugins and services that Ingo listed (thank you, Ingo), it seems that they use third-party servers to store user data. Using other servers can solve some problems, such as performance and difficulty of installation. However, I think this module should not use any third-party server to store user information, because using a third-party server means lock-in.

As Philipp mentioned above, I think the biggest problems in designing this module are how to reduce site maintenance cost and how users can customize their content with our module.

As Ingo mentioned, there are two ways to customize content: templates and widgets. I think their pros and cons are as follows.

*Templates

Pros: The area a user can customize is large. Easy to use for advanced users.
Cons: Difficult to use for beginners.

*Widgets

Pros: Easy to use and maintain for beginners.
Cons: The area a user can customize is limited.

Although they are harder to use, I think templates would be better than widgets, because templates allow much more customization and would be easy to use for advanced users.

I think content personalization is closely related to the site's theme, and the people who customize content personalization settings would be able to edit templates. Of course this is my assumption, so it needs more discussion or data.

The last link Ingo posted is really important for designing this module.
Maintenance cost is a big problem in content personalization.

I think one solution for this is using audience types.
Audience types are defined on a configuration page, and users use them to personalize their content.
This will reduce the maintenance cost a little.
This idea may be a little similar to the idea that Philipp mentioned in (3).
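
As a rough illustration of why this helps with maintenance: if audience types are defined once, each content block only needs to list its variants per audience type plus a default. The sketch below is Python for illustration only, and the block name, audience type names, and texts are invented.

```python
# Hypothetical content block with one variant per audience type and a fallback.
CONTENT_VARIANTS = {
    "homepage_banner": {
        "local_visitor": "Visit us in store this weekend.",
        "default": "We ship worldwide.",
    },
}


def render_block(block, audience_types, variants=CONTENT_VARIANTS):
    """Return the first variant matching the visitor's audience types, else the default."""
    options = variants[block]
    for audience in audience_types:
        if audience in options:
            return options[audience]
    return options["default"]


banner = render_block("homepage_banner", ["cms_researcher", "local_visitor"])
```

Changing who counts as a local visitor then only touches the audience type definition, not every block that uses it.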

Providing visitors with an interface to select their audience type is a good idea.
This makes content personalization transparent to visitors, and when a user visits the site from another computer or browser, they can get the personalized page quickly.

I think the content personalization in this module should not be based on a complex algorithm, like Google's.

As Philipp mentioned, I think language selection is a good example of content personalization.

This needs more discussion.
Any opinions on this are very welcome.

Thanks,
Yuki

Yuki Awano

May 7, 2012, 2:20:02 AM
to silverst...@googlegroups.com
Hi,

I am currently writing a design doc for the Content Personalization Module.
You can check it at the link below.

Design doc

And the project page is below.

This document is hosted on Google Docs, and anyone can comment freely.
We really welcome any ideas, opinions, and comments about this module.
Please feel free to comment on the doc or reply to this post.

We will start writing code for this module on May 21, and until then we want to define the specification of this module as far as possible.

I hope this will become a great discussion.
Thanks in advance.

Regards,
Yuki