Page: Proposal_NewUrlStructure


Target Release: Roller 3.0
Original Author: AllenGilliland
See also: Proposal_NewUrlImplementation
Status: Under review


Redesign Roller's url structure to provide a consolidated url space per weblog and alleviate some technical limitations of the current url structure.

Summary#

Roller's current url structure has a variety of technical limitations and does not support a well-defined url space per weblog. Urls are spread somewhat erratically across the application url space, and the current design makes it difficult to add new features to the urls. To address these issues we need to start from the ground up and redesign the url structure.

This is page 1 of a 2 page proposal. This first page covers the general approach to redesigning the url structure and proposes how the new urls will look. The second page (Proposal_NewUrlImplementation) provides more technical details about how the implementation will happen.

Requirements#

  • Provide a consolidated url space per weblog
  • Include support for multi-language blogs
  • Maintain backwards compatibility for old urls

Related JIRA issues:

Issues#

  • Make sure old urls are properly discontinued and redirected to the new urls
  • Protect the various application urls that should not be mistaken for weblogs

Design#

How it will work:

  • Define a new url structure.
  • Move all existing servlets to a new location.
  • EOL old urls and redirect them to new urls.

The proposed url structure#

The Theory:

  • /<weblog>
  • /<weblog>(/lang)/<ctx>
  • /<weblog>(/lang)/<ctx>/extra/info

The main point of this structure is to 1) provide an individual url space per weblog and 2) make sure that we maintain flexible control over that url space. The key is that we keep control over the <ctx> portion of the url, because that ensures we can continue to extend the url structure in the future without running into a constrained url space. Variable or user-defined data must not be used at the <ctx> level of the url.

Basic Url Overview:

  • /<weblog>
  • /<weblog>/entry/<anchor>
  • /<weblog>/date/<YYYYMMDD>
  • /<weblog>/category/<category>
  • /<weblog>/tags/java+frameworks+spring
  • /<weblog>/feed/<flavor>
  • /<weblog>/page/<page link>
  • /<weblog>/resource/<uploaded file>
  • /<weblog>/search?q=<query>

Language Support (<lang> is a 5-character locale identifier such as "ja_JP"):

  • /<weblog>/<lang>
  • /<weblog>/<lang>/entry/<anchor>
  • /<weblog>/<lang>/date/<YYYYMMDD>
  • /<weblog>/<lang>/category/<category>
  • /<weblog>/<lang>/tags/java+frameworks+spring
  • /<weblog>/<lang>/feed/<flavor>
  • /<weblog>/<lang>/page/<page link>
  • /<weblog>/<lang>/resource/<uploaded file>
  • /<weblog>/<lang>/search?q=<query>

Language is optional. Only a small percentage of the blogs on most sites will want to maintain their blog in multiple languages, so we don't want to force a language identifier into all urls. If a language identifier exists in the url, it will always restrict the view to entries/content published in that language.

Choosing a default language. Users will be allowed to choose a default language for their weblog, which will be used for all urls that don't include a language identifier. Users may also choose to include all languages in their default view if they don't plan to use any language specific urls.
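
The path layout and the optional language segment described above can be sketched as a small parser. This is a minimal Python illustration, not Roller code: the names `WeblogPath`, `parse_weblog_path`, and `effective_language` are hypothetical, the set of contexts is copied from the url overview above, and the pattern used to recognize a <lang> segment (two lowercase letters, an underscore, two uppercase letters) is an assumption based on the "ja_JP" example.

```python
import re
from typing import NamedTuple, Optional

# Contexts copied from the url overview; the set is fixed by the application
# and never derived from user data.
KNOWN_CONTEXTS = {"entry", "date", "category", "tags",
                  "feed", "page", "resource", "search"}

# Assumption: a <lang> segment always looks like "ja_JP".
LANG_PATTERN = re.compile(r"^[a-z]{2}_[A-Z]{2}$")


class WeblogPath(NamedTuple):
    weblog: str
    lang: Optional[str]
    ctx: Optional[str]
    extra: Optional[str]


def parse_weblog_path(path: str) -> Optional[WeblogPath]:
    """Split /<weblog>(/lang)/<ctx>/extra/info; return None if it does not fit."""
    segments = [s for s in path.split("/") if s]
    if not segments:
        return None
    weblog, rest = segments[0], segments[1:]

    # The language segment is optional and, if present, comes right after the weblog.
    lang = None
    if rest and LANG_PATTERN.match(rest[0]):
        lang, rest = rest[0], rest[1:]

    if not rest:
        return WeblogPath(weblog, lang, None, None)

    # Anything at the <ctx> level must be one of the known contexts.
    ctx, extra = rest[0], "/".join(rest[1:]) or None
    if ctx not in KNOWN_CONTEXTS:
        return None
    return WeblogPath(weblog, lang, ctx, extra)


def effective_language(parsed: WeblogPath, weblog_default: Optional[str]) -> Optional[str]:
    """Language used to filter content; None stands for 'all languages'."""
    return parsed.lang if parsed.lang is not None else weblog_default
```

Because the <ctx> values come from a closed set, a 5-character locale segment cannot be confused with a context, which is the kind of control over the url space the structure is meant to preserve.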

Entry Collection views:

  • /<weblog>/date/<YYYYMMDD>?page=1
  • /<weblog>/category/<category>?page=1
  • /<weblog>/tags/java+frameworks+spring

For the main types of collections (date based, category, and tags) we will offer a clean path based url. These path based urls will NOT allow constraint by additional criteria and will only support a limited number of additional query parameters for things like paging.

  • /<weblog>?category=<cat>&date=<date>&tags=foo+bar&page=1

The main page of the blog will serve a collection of entries constrained by any number of criteria specified by query params. These urls will not be promoted as widely in the software since they shouldn't be needed too often, but they will be available.
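
The difference between the two kinds of collection urls can be sketched as follows. This is only an illustration under stated assumptions: the function name `collection_criteria` is hypothetical, "page" is assumed to be the only query parameter honored on path based urls, a missing page number is assumed to default to 1, and the optional language segment is left out for brevity.

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: path based collection urls honor only paging.
PATH_VIEW_PARAMS = {"page"}
# Parameters shown in the query based example for the weblog main page.
MAIN_PAGE_PARAMS = {"category", "date", "tags", "page"}
COLLECTION_CONTEXTS = {"date", "category", "tags"}


def collection_criteria(url: str) -> dict:
    """Return the filter criteria implied by a collection url."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Keep only the first value of each query parameter, as a string.
    query = {k: v[0] for k, v in parse_qs(parts.query).items()}

    if len(segments) == 1:
        # /<weblog>?category=...&date=...&tags=...&page=N
        allowed = MAIN_PAGE_PARAMS
        criteria = {}
    elif len(segments) == 3 and segments[1] in COLLECTION_CONTEXTS:
        # /<weblog>/<ctx>/<value>?page=N : the path supplies the single criterion.
        allowed = PATH_VIEW_PARAMS
        criteria = {segments[1]: segments[2]}
    else:
        raise ValueError("not a collection url: " + url)

    # Query parameters outside the allowed set are silently ignored.
    criteria.update({k: v for k, v in query.items() if k in allowed})
    criteria["page"] = int(criteria.get("page", 1))
    return criteria
```

With this sketch, a `date` parameter appended to a category url is dropped, while the same parameter on the main page url becomes part of the criteria.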

Permalinks and posting comment and trackback data:

  • /<weblog>/entry/<anchor>
  • /<weblog>/<lang>/entry/<anchor>

Comment and trackback data is POSTed to the permalink urls. The first benefit is that we aren't required to maintain a whole new url like /<weblog>/comment/<anchor> just for posting data, so it keeps our url space trimmed. Secondly, it just makes sense because people POST comments/trackbacks to entries.
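
A minimal sketch of how one permalink url could serve both reading and posting, by dispatching on the http method. The function name `permalink_action` and the rule used to tell a trackback from a comment are assumptions made for illustration: a trackback ping is recognized here by its "url" form field, and a comment by a hypothetical "content" field.

```python
def permalink_action(method: str, form: dict) -> str:
    """Decide what a request to /<weblog>/entry/<anchor> is asking for."""
    method = method.upper()
    if method in ("GET", "HEAD"):
        return "view"
    if method == "POST":
        # Hypothetical rule: a trackback ping carries "url" but no "content";
        # anything that carries "content" is treated as a comment.
        if "url" in form and "content" not in form:
            return "trackback"
        if "content" in form:
            return "comment"
        return "bad-request"
    return "method-not-allowed"
```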

Feed Urls:

  • /<weblog>/feed/<flavor>
  • /<weblog>/feed/rss?cat=<cat>&excerpts=true
  • /<weblog>/feed/atom?cat=<cat>&excerpts=true

This covers any content feeds that are available. Currently we have rss, rss091, rss092, and atom. We expect comment feeds to be a likely addition in the near future.
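
As a rough sketch, a feed url could be validated against the current list of flavors like this. The function name `parse_feed_url` is hypothetical, the optional language segment is ignored for brevity, and treating a missing `excerpts` parameter as false is an assumption.

```python
from typing import Optional
from urllib.parse import urlsplit, parse_qs

# Flavors named in the text above.
FEED_FLAVORS = {"rss", "rss091", "rss092", "atom"}


def parse_feed_url(url: str) -> Optional[dict]:
    """Parse /<weblog>/feed/<flavor>?cat=<cat>&excerpts=true; None if invalid."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 3 or segments[1] != "feed" or segments[2] not in FEED_FLAVORS:
        return None
    query = parse_qs(parts.query)
    return {
        "weblog": segments[0],
        "flavor": segments[2],
        "category": query.get("cat", [None])[0],
        # Assumption: excerpts are off unless explicitly requested.
        "excerpts": query.get("excerpts", ["false"])[0].lower() == "true",
    }
```

Adding a comment feed later would then only mean adding one more flavor to the set.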

User Defined Pages:

  • /<weblog>/page/<page link>

This section is for any special pages that the user wants to build from their templates. A good example would be a css or js file used by their weblog. Custom pages will have access in their templates to any query parameters appended to the url, but no path based extensions will be allowed.

User Uploaded Resources:

  • /<weblog>/resource/<file name>

This works just as it does now, serving up any static files uploaded by the user.

Search Results:

  • /<weblog>/search?q=<search query>&cat=<category>&page=4

Tags Support:

  • /<weblog>/tags/java+frameworks+spring
  • /<weblog>/feed/rss?tags=java+frameworks+spring
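
One detail worth keeping in mind with the "+" separator: in a url path "+" is a literal character, while standard form decoding of a query string turns "+" into a space. A hedged sketch of reading the tag list in both positions (the function names are hypothetical):

```python
from urllib.parse import parse_qs, unquote


def tags_from_path_segment(segment: str) -> list:
    """Tags from the last path segment of /<weblog>/tags/java+frameworks+spring."""
    # In a path, "+" stays a literal plus, so split on it and then
    # percent-decode each tag.
    return [unquote(tag) for tag in segment.split("+") if tag]


def tags_from_query(query: str) -> list:
    """Tags from a query string such as tags=java+frameworks+spring."""
    # parse_qs applies form decoding, which turns "+" into a space,
    # so the decoded value is split on whitespace.
    values = parse_qs(query).get("tags", [])
    return values[0].split() if values else []
```

Both calls yield the same list for the example urls above, so the rest of the application would not need to care where the tags came from.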

Move existing servlets to a new location#

There are many reasons to do this, but the most important reason is that in order to successfully transition to the new url structure we need to EOL the old urls and redirect them. This is important because if the old urls continue to work then they will continue to be used, which is not ideal.

In order to ensure that we can continue to add/remove/modify any of Roller's servlets without affecting the overall url space for an installation we will plan to move all servlets under a single root url path like /roller-servlets/*. These servlets will be hidden away and not meant for direct access. For example we might have ...

  • /roller-servlets/page/*
  • /roller-servlets/resources/*
  • /roller-servlets/comment/*
  • /roller-servlets/feed/*

This same approach would apply for the Roller editor/admin interface as well. We would like to move away from the current urls of /editor/* and /admin/* and migrate to a consolidated url space for the entire interface. Something like /roller-UI/* would be appropriate.
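
To illustrate how public weblog urls might be handed off to the hidden servlets while keeping the reserved roots from being mistaken for weblogs, here is a small sketch. Everything about the mapping is hypothetical: which <ctx> goes to which servlet, and the way the original path is appended to the internal path, are assumptions chosen for the example and not part of the proposal. Only the /roller-servlets/* and /roller-UI/* prefixes and the servlet names come from the text above. The optional language segment is ignored for brevity.

```python
from typing import Optional

# Top-level names that must never be treated as a weblog handle.
RESERVED_ROOTS = {"roller-servlets", "roller-UI"}

# Hypothetical mapping from the public <ctx> to a hidden servlet.
# Contexts without an entry (for example "search") are left unmapped here.
CTX_TO_SERVLET = {
    None: "page",          # /<weblog>
    "entry": "page",
    "date": "page",
    "category": "page",
    "tags": "page",
    "page": "page",
    "feed": "feed",
    "resource": "resources",
}


def internal_target(public_path: str) -> Optional[str]:
    """Return the hidden servlet path for a public weblog url, or None."""
    segments = [s for s in public_path.split("/") if s]
    if not segments or segments[0] in RESERVED_ROOTS:
        return None  # not a weblog url; leave it to normal handling
    ctx = segments[1] if len(segments) > 1 else None
    servlet = CTX_TO_SERVLET.get(ctx)
    if servlet is None:
        return None
    # Assumption: the original path is passed along after the servlet name.
    return "/roller-servlets/" + servlet + "/" + "/".join(segments)
```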

EOL old urls and redirect them to their new location#

After all of the old servlets have been properly moved to their new locations we will have to provide a redirect service for the old urls. All old urls will get a 301 Moved Permanently http redirect response pointing to the new url.
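
A sketch of what the redirect step could look like. The old url shapes used here, /page/<weblog>?entry=<anchor> and /rss/<weblog>, are assumptions made for the example (this page only confirms that current permalinks use ?entry=<anchor>), and the function name `redirect_for_old_url` is hypothetical.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlsplit, parse_qs, quote


def redirect_for_old_url(url: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_url) for a recognized old url, else None."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)

    # Assumed old weblog/permalink form: /page/<weblog>?entry=<anchor>
    match = re.fullmatch(r"/page/([^/]+)/?", parts.path)
    if match:
        weblog = match.group(1)
        if "entry" in query:
            # Re-encode the anchor because parse_qs has decoded it.
            return 301, "/" + weblog + "/entry/" + quote(query["entry"][0])
        return 301, "/" + weblog

    # Assumed old feed form: /rss/<weblog>
    match = re.fullmatch(r"/rss/([^/]+)/?", parts.path)
    if match:
        return 301, "/" + match.group(1) + "/feed/rss"

    return None
```

Urls that are already in the new form fall through and return None, so they are never redirected again.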

Key decisions and points of discussion#

These are items that require some discussion and a group decision.

  • File extensions? Does it make sense to add a standard .xxx file extension to all of the new urls to better identify content types? This could make it a lot easier to provide a future version of Roller that allows for generating static content files.
    • This was discussed for a while on the mailing list and the community decided that file extensions are not needed, so the new url structure will not make use of file extensions.
  • For multi-language blogs, should urls without an explicit language identifier use browser language preferences? This seems like a pretty good idea.
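
If browser language preferences are used, they arrive in the Accept-Language request header. A minimal sketch of picking a weblog language from that header follows. The function names are hypothetical, and matching a browser tag such as "es-ES" to a locale identifier such as "es_ES" by the primary language subtag only is a simplifying assumption. Passing `None` as the default stands for "all languages".

```python
from typing import List, Optional


def parse_accept_language(header: str) -> List[str]:
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for index, item in enumerate(header.split(",")):
        piece = item.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without an explicit q value has weight 1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, index, tag.strip()))
    return [tag for _, _, tag in sorted(weighted)]


def choose_language(header: str, available: List[str],
                    default: Optional[str]) -> Optional[str]:
    """Pick the first preferred language the weblog publishes in, else default."""
    by_primary = {}
    for lang in available:
        by_primary.setdefault(lang.split("_")[0].lower(), lang)
    for tag in parse_accept_language(header):
        match = by_primary.get(tag.split("-")[0].lower())
        if match:
            return match
    return default
```

For example, a weblog published in `["en_US", "es_ES", "zh_CN"]` with English as the default would serve `es_ES` to a browser sending `es-ES,es;q=0.9,en;q=0.8`.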

Comments#

Add comments here.

SeanGilligan: Looks really good, Allen! I'm still interested in completing the Spring MVC proposal (long couple of days ... huh?). I have written some code just to validate things for myself, but I haven't tested it yet -- having the URL structure will help with that. I truly believe it will help with URL migration and general flexibility/customization. One of the benefits of using Spring MVC controllers is that you have a lot greater flexibility in URL mapping. Spring MVC uses a dispatcher servlet and a set of controllers. I will propose refactoring all/most/many of the servlets into controllers, but you can have the Spring MVC dispatcher map URLs to existing servlets. Now that we have a proposed URL structure, I'll look at doing it with the built in SimpleUrlHandlerMapping, but the option of writing a custom mapping is there if necessary.

  • AllenGilliland: Great! I am definitely still interested in seeing how Spring may be able to help with the url mapping. Just remember that as I said before, to really make this a viable option I will need to see a strong proof of concept or prototype, so I'll need some code to look at.
  • SeanGilligan: The start of the proposal is here: Proposal_VelocityToSpringMVC (I'm going to attach a code diff from some experimental refactoring that I did, that will give you an idea of what the code might look like when done properly)

SeanGilligan: Is there a written description of <anchor>, <flavor>, <page link> somewhere that we can link to?

  • AllenGilliland: No, there isn't, but maybe I should add that. The quick version is that <anchor> represents the unique identifier for an entry which is used in the current permalink urls via ?entry=<anchor>. <flavor> is just an identifier for the feed type, like "rss" or "atom", and we can add any other feeds like "comment", which Elias had suggested. <page link> corresponds directly to the "link" field for a template file and just identifies how a given template is mapped to a url.

SeanGilligan: I'm not sure if browser language settings should be used to *filter* out entries, that could be confusing to some users. I think they should only be used if an entry is available in multiple formats. In other words, browser settings could be used to select among languages for a permalink, but should not be used to filter out entries in a search URL.

  • AllenGilliland: Hmm, good point. One thing to remember though is that users who decide to blog in multiple languages will always have to choose a default language which may or may not be "all languages". So I may do my blog in English, Spanish, and Chinese and have English as the default. Wouldn't it make sense that someone going to my blog with Spanish as their preferred browser language get my blog in Spanish instead of English? In the case where the author chooses "all languages" as their default then we would not want to constrict based on browser language settings.
    • SeanGilligan: To rephrase what I was trying to say above: In the case where a user is visiting/browsing/reading a weblog, their browser language settings should only be used to choose among alternate languages when a resource is available in multiple languages. If the "resource" is a "collection" then the browser setting should be ignored and all "resources" should be returned. We probably need to specify/clarify this on a case-by-case basis. For example, a "day page" should return all entries for that day. When an individual entry is available in multiple languages, the visitor's preferred language should be used; otherwise the entry should still be returned in whatever language it is available in. The same goes for search results, although there could be a checkbox to select this behavior.
      • SimonPhipps: I agree with this. The vision should be to return the full blog with my preferred language where possible. Skipping entries should not be the default behaviour. Another thing to consider is that the sequence in which languages should be selected if available is probably a preference the blog owner should set. So, in China I might want the order (chinese 1) -> (chinese 2) -> (english) whereas in Italy I might want (Italian) -> (spanish) -> (french) -> (english) because I know local readers more often speak Spanish and French as a second language than English.

SeanGilligan: Is it worthwhile to consider reserving URLs or defining a recommended mechanism for adding URLs from plugins or extensions? For example, I've thought about creating a capability for users to create a set of named feeds that they can create via a web gui and can enumerate in a velocimacro (or blog tag) - how would these be mapped into the URL space?

  • AllenGilliland: I'm not sure that I understand what you are proposing here. Users are always given complete control over the /<weblog>/page/* and /<weblog>/resource/* sections because those parts map to any user defined templates or user uploaded resources. The rest of the urls are basically handled by the system and are provided for users.
  • SeanGilligan: I'm thinking someone might want to add a /<weblog>/feeds/* section to their blog that gives alternate feed functionality. This functionality would be implemented in either some kind of Roller plug-in or in a locally modified version of Roller that has an additional Servlet or SpringMVC Controller, etc. We should think a little about these types of "use cases" since many people -- including me ;) -- may be wanting to extend Roller with local modifications.
    • DaveJohnson: a user could do that with a custom page in Roller 2.0 and should be able to continue to do that in Roller 3.0.
      • SeanGilligan: But what if the plugin needs to add a whole tree of pages? Or it needs new Java code to load new items into the Model/Context? What if it doesn't even want to use a Template/View?

DaveJohnson: It may be a bad practice to offer both RSS and Atom feeds of the very same content. Perhaps we should offer only Atom 1.0 feeds. Atom is a well-specified IETF standard and all parser libraries and aggregators support it.

  • AllenGilliland: Actually, I have seen a number of sites that offer both and I think that makes the most sense. Let clients pick whatever format they like best rather than force that decision on them. Plus, I would really hate to alienate any people that need/want rss instead of atom for whatever reason. I say we support auto discovery for both.
  • AnilGangolli: I agree with Allen. What's the bad practice concern?

AnilGangolli: I don't see any <ctx> space under which combining date and tag or category restrictions is possible. (Why not?)

  • AllenGilliland: That is handled by the query param version of the entry collection views ... /<weblog>?category=<cat>&date=<date>&tags=foo+bar&page=1

This page (revision-3) was last changed on 17-Jan-2009 09:12 by DaveJohnson