GSOC: Improved search engine capabilities

  • Hi all, I'm one of the Google Summer of Code students for this year. I just wanted to say hi and create this topic as a place to discuss my project. Hopefully you can give me some helpful hints and tips on using Postnuke, as well as request particular features you think would be useful to include.

    Instead of re-writing my entire proposal I'll just link to it (http://community.postnuke.com/Downloads-req-getit-lid-29.htm).

    I certainly don't underestimate the value of all of your opinions, so I'm more than willing to listen to anything anyone has to say. I'm sure that as a collective you offer a huge vault of knowledge, which I think I'll need to tap into for my project to be a proper success.

    My first question is: what is the best way to get my head properly around Postnuke? Should I read the documentation? If so, which pages are best? Or am I better off just digging into the code and looking around?

    Thanks



    edited by: Urchin, Apr 22, 2008 - 11:05 PM
  • Hello,

    Do you have Skype? Contact me via Skype (ammodump or David Pahl). I am one of the PN support staff here, so I would be glad to chat with you in real time.

    Welcome to the community,

    --
    David Pahl
    Zikula Support Team
  • Hi,
    Unfortunately I don't have a microphone at the moment, but I'm probably gonna get one over the weekend.
    Hammerhead told me that Skype is in regular use by a lot of the staff, so I'll get involved as soon as I can.

    Thanks for the welcome.
  • Actually, we just use text chat, not VoIP, so speakers and a mic are not required.

    --
    David Pahl
    Zikula Support Team
  • Welcome to Postnuke!

    The best way to start with Postnuke is playing around with it:
    1. Install it
    2. Install a module or 2
    3. Install a theme

    Once you know your way around a bit, you might start reading the development documentation. It is a bit rough, but it gives you a good hint on where to look in the code. Then there are some example modules in the SVN which should serve as examples for features. There is one for workflows, one for categories, and so on. They consist of more comments than code, so you can learn from them.

    As you know, your project is about enhancing the search engine. At the moment every module contains an API function, so the search module knows how to pass the search term to the module and get back the result. In the end you have a bunch of result sets from all your modules. If you search for the word "car" you get the matching results from the news, from the comments, from the calendar, and so on. The search engine doesn't care that some of the results point to the same URL, for example when the word "car" appears in an article and in the comments for that article. IIRC we would like to have some kind of page-based search where your result consists of unique pages, no matter whether the word is in a comment or in the article.
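
    To make the page-based idea a bit more concrete, here is a minimal Python sketch. It is not Postnuke code: the module names, URLs and the `merge_by_page` helper are made up for illustration, and it simply assumes that every module hands back a list of hits that each carry a URL. Hits pointing to the same URL are collapsed into one page entry.

```python
def merge_by_page(result_sets):
    """Collapse per-module hits so that every URL appears only once.

    result_sets maps a (hypothetical) module name to a list of hits,
    where each hit is a dict with at least a "url" key.
    """
    pages = {}  # dicts keep insertion order, so the first hit decides the position
    for module, hits in result_sets.items():
        for hit in hits:
            page = pages.setdefault(hit["url"], {"url": hit["url"], "sources": []})
            page["sources"].append(module)
    return list(pages.values())


# Made-up result sets for the search term "car"
result_sets = {
    "news": [{"url": "/news/42", "title": "New car models"}],
    "comments": [{"url": "/news/42", "title": "Re: New car models"}],
    "calendar": [{"url": "/calendar/7", "title": "Car show"}],
}

pages = merge_by_page(result_sets)
for page in pages:
    print(page["url"], "found in:", ", ".join(page["sources"]))
```

    The article and its comment end up as a single page that remembers both sources, so the result list could still show where the term was found.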

    But I guess your mentor will already have told you all about it.

    --
    best regards from Kiel, sailing city

    Steffen Voss

    Member of the Zikula Steering Committee
    Read The Zikulan's Blog "If you want people to RTFM, make a better FM!"
  • Hi!

    Just posted in the news (http://community.pos…le=article&sid=2904). Had not seen this thread. What I said there was in short: welcome, please just have a quick look at Zend_Search_Lucene http://devzone.zend.com/node/view/id/91?

    Cheers
  • Kaffeeringe.de: thanks for the tips on getting started. Very comprehensive.

    dits: had a look at Zend_Search_Lucene. It looked like it had some promising similarities, although I'm not convinced with it not using a relational database, don't really see why that's necessary or advantageous. Thanks for the heads up.
  • Urchin

    although I'm not convinced with it not using a relational database, don't really see why that's necessary or advantageous


    It's probably done for 1. ease of implementation and 2. performance.

    It seems it is possible to implement a custom storage method by overriding the Zend_Search_Lucene_Storage_Directory and Zend_Search_Lucene_Storage_File classes. (http://framework.zen…cene.extending.html)
  • We might like to have a look at http://xapian.org/ too. I love it for sites where performance is important.
    Greetings,
    Chris

    --
    an operating system must operate
    development is life
    my repo
  • slam

    We might like to have a look at http://xapian.org

    Xapian is nice too! And its grass roots apparently lie in ... Cambridge
  • I think you can begin by looking at the blank module, which will give you the basic "template" of a Postnuke module.

    For your project, even without talking about caching, I think performance will be very important, as searching a website puts a lot of load on the CPU (perhaps some stress tests with Apache Benchmark would be interesting).

    Moreover, as there is a server-side Ajax implementation in Postnuke, will you consider it in your project, or do you want to focus on the engine part?



    edited by: mumuri, Apr 24, 2008 - 01:36 AM
  • A very nice feature would be the ability to build indices per category.
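
    A rough Python sketch of what per-category indices could look like, purely as an illustration: the document fields (`id`, `category`, `text`) and both function names are assumptions, not anything that exists in Postnuke. Each category gets its own small inverted index mapping a word to the ids of the documents containing it.

```python
from collections import defaultdict


def build_category_indices(documents):
    """Build one inverted index (word -> set of document ids) per category."""
    indices = defaultdict(lambda: defaultdict(set))
    for doc in documents:
        for word in doc["text"].lower().split():
            indices[doc["category"]][word].add(doc["id"])
    return indices


def search(indices, category, term):
    """Look a single term up in the index of one category only."""
    return sorted(indices.get(category, {}).get(term.lower(), set()))


# Made-up documents
documents = [
    {"id": 1, "category": "sports", "text": "Rally car wins the race"},
    {"id": 2, "category": "sports", "text": "Sailing regatta in Kiel"},
    {"id": 3, "category": "tech", "text": "Self-driving car software"},
]

indices = build_category_indices(documents)
print(search(indices, "sports", "car"))
print(search(indices, "tech", "car"))
```

    Restricting a search to one category then only touches that category's index, which is presumably the point of the request.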
  • -the multi-file-format search would be excellent
    -nested searching/search refinements/search filtering would also be great
    -http://www.samsung.com/ca/consumer/type/type.do?group=homeappliances&type=airconditioners: auto complete searching (just an idea)
    -google-like: did you mean _______?
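
    For the "did you mean" idea, a minimal sketch in Python using the standard library's `difflib.get_close_matches`. The vocabulary and the `did_you_mean` helper are hypothetical; in a real search engine the vocabulary would presumably come from the words already in the index.

```python
import difflib


def did_you_mean(term, vocabulary, cutoff=0.6):
    """Suggest the closest known word, or None if the term is known or nothing is close."""
    if term in vocabulary:
        return None
    # n=1: only the best match; cutoff: minimum similarity between 0 and 1
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# Made-up vocabulary of indexed words
vocabulary = ["calendar", "comment", "article", "category"]

suggestion = did_you_mean("calender", vocabulary)
if suggestion:
    print("Did you mean %s?" % suggestion)
```

    This only compares spellings, so it is a starting point at best; weighting suggestions by how often a word occurs in the index would be a natural next step.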
  • dits/slam,
    I was wondering if you could tell me whether the high performance of Xapian comes mainly from the fact that it's written in C++ (which won't be much good for my project) or whether it just has good algorithms and data structures.

    Thanks
