Dashboard > Bouncer > ... > Bouncer 2 > Non-conforming Path Problem
Bouncer Log In   View a printable version of the current page.
Non-conforming Path Problem
Added by K Lars Lohn, last edited by K Lars Lohn on Mar 23, 2005  (view change)
Labels: 
(None)

The Problem

The programs that make up Bouncer all generate the paths for downloadables the same way:

${mirror_basepath}/${pv_path}/${oss_path}/${lang_path}/${file_name}

Some products are exceptions to this rule and need a different path:

${mirror_basepath}/${pv_path}/${file_name}

The bouncer programs would be more flexible if they could account for this difference. This problem comes up both in extending Bouncer for OpenOffice and the newly discovered Mozilla seamonkey non-conforming URL.

Accepted Solution

In the 'files' table, replace the name of the file with a reference to a pathname template. Add a new table called 'templates'.

Further, to refactor all the Bouncer programs in the use of this new template, add a new column to the 'downloadable' table called 'url'. When Sentry populates this table, it could build the complete URL for the downloadable at that time. Any other program using the 'downloadable' table, as well as Sentry itself, could use the prebuilt URL instead of generating the value on the fly.

Add to 'product_versions' short names for the product and version. This will assist in generating actual file names.

Summary of DB Schema Changes

table name Column Name Use type
downloadable downloadable_url add new varchar(255)
files template_id add new int
files name removed  
product_versions pv_short_product_name add new varchar(255)
product_versions pv_short_version add new varchar(255)
templates template_id new int
templates template_text new varchar(255)
sentry_log sentry_log_id new int (auto)
sentry_log downloadable_id new int
sentry_log sentry_log_message new varchar(255)

Previous Proposals

Proposal 1

Give the 'products_versions' table an additional field called 'path_template'. The values in this field are strings in the format used in the problem statement above: column names from the database used as substitution variables in the classic UNIX style. Any text without the dollar sign prefix is considered literal text. Punctuation, such as '/', are also considered literals. It's easy to understand. Only users adding new products would encounter this template field. The default value that the user sees should be configurable.

Further, to refactor all the Bouncer programs in the use of this new template, I propose adding a new column to the 'downloadable' table called 'url'. When Sentry populates this table, it could build the complete URL for the downloadable at that time. Any other program using the 'downloadable' table, as well as Sentry itself, could use the prebuilt URL instead of generating the value on the fly.

This has several advantages.

First, it localizes the interpretation of the template to one place. Should the interpretation need to change, the code only needs to change in one place. No other programs that use the 'downloadable' table need to know how to interpret the template.

Second, at least for Sentry, this can simplify the SQL used to create a URL. When Sentry wants a URL, it currently executes a query to get the various parts and then creates the URL on the fly. If the URL is prebuilt, Sentry's queries do not need multilevel joins. While this makes for more efficient SQL, its greater value is in the simplification of the code itself. The simpler things look in the code, the easier it will be for future maintainers.

Proposal 1 Discussion

This works for the Mozilla Seamonkey problem, but it does not help the OpenOffice Problem. The Mozilla project has the path exception based on the product. The OpenOffice project's path exception is basedon the language not the product.

Consider this case for OpenOffice: Prod1Ver1 for US_EN would be ${mirror_basepath}/stable/${file_name} while Prod1Ver1 for ko_kr would be ${mirror_basepath}/localizations/${file_name}. Having the template associated with 'Product_verisions' just doesn't quite work.

Proposal 2

Add a new table called 'url_templates'. It would have five columns: id, pv, os, lang, template (expand actual names to fit the scheme of the whole system). Alternatively, it may be simpler to put this data in a configuration file.

For Mozilla, the table could look like this:

id pv os lang template
1
${base_path}/${pv_path}/${oss_path}/${lang_path}/${file_name}
2 SeamonkeyID
${base_path}/${pv_path}/${file_name}

for OpenOffice, it would look like this:

id pv os lang template
1
${base_path}/localizations/${file_name}
2
US_EN_ID ${base_path}/stable/${file_name}

Just like in the previous proposal, we'd add the 'url' column to the 'downloadable' table and prebuild the url for any downloadable.

It would be Sentry's responsibility to interpret the 'url_templates' table and apply the appropriate template given the product, os and language that it has at hand. Sentry would have to find the template that has the most specific match for the three variables.

Proposal 2 Discussion

This is a rudimentry rules system. What would Sentry do with conflicts bewteen rules? Perhaps assign a weight based on the id number of the rule? A conflict could be resolved by always choosing the rule with the higher id number.

What about lists in the table cells to save making multiple entries? I don't think we need to take this that far. It's rudimentary and should stay that way.

Argument to use a config file instead of a table - The actual use of the data in the table is not one that is done using SQL. Sentry has to download the entire table to use the rules. Why put something in the database if SQL is not going to be used as the primary means to use it? This also simplifies the user interface. Imagine a config file addition to Sentry like this:

pathTemplateRules=(('*', '*', '*', '${mirror_basepath}/${pv_path}/${oss_path}/${lang_path}/${file_name}'),
                     ( 1 , '*', '*', '${mirror_basepath}/${pv_path}/${file_name}'))

This would have to be all in one line, though. This is very Pythonic in form, so it should be easy integrate into Sentry. For a complex rule system, it gets pretty verbose and begs for a easier user interface.

Proposal 3

In the 'files' table, replace the name of the file with a pathname template. Just like with the previous proposals, add the 'url' column to the 'downloadable' table to prebuild the entire URL.

So that we can allow munging of the file name itself, I propose adding a couple more columns to the 'product_versions' table: pv_short_product_name and pv_short_version. The values in these fields would be available as variables to the templates. That way we could make templates with a finer granularity for variable content:

${pv_path}/${os_path}/${lang_path}/${pv_short_product_name} ${pv_short_version}.dmg

Proposal 3 Discussion

This simplifies the table structure over Proposal 2 while keeping Proposal 2's fine granularity.

Brilliant!

So, how could we go about munging filenames and not just directories?

Say:

{$mirror_basepath}/{$pv_path}/filename_{$os}_{$lang}.{$extension}

?

Posted by Corey Shields at Mar 11, 2005 07:02 | Permalink


We don't have to munge the file names themselves. Since we must store a hash value with for each unique file in the system, every unique file must have an entry in the 'files' table. As long as we have these entries for every file, we might as well store the complete file name, too. Why try to generate something that could be a constant?

Posted by K Lars Lohn at Mar 11, 2005 10:03 | Permalink


I think option 2 would give us the most flexibility. The user interface would have to be tweaked, though in order to allow for users to be able to add N localizations with minimal effort. I am working on a mock-up of the user interface that would be used to add 1,2,....,N files.

Posted by Michael Morgan at Mar 16, 2005 08:51 | Permalink


Actually, doing both 2 and 3 would probably be the way to go - isn't option 3 an extension of option 2?

Posted by Michael Morgan at Mar 16, 2005 08:53 | Permalink

No, option 3 supercedes the necessity of option 2. While in option 3 we may need an additoional table to cache templates, it's for the convenience of the user interface only. That table would not need the foreign keys to the product-versions, os and lang tables.

In option 2, Sentry would search the 'url_template' table for an appropriate template for a given file. In option 3, sentry only needs to just grab the template from the files table - there is no need to search.

Posted by K Lars Lohn at Mar 16, 2005 15:12 | Permalink


Doesn't pv_path become obsolete given our solution? If we are using a template to build the entire URL after the mirror_baseurl, then shouldn't we use pv_(product|version)_short_name explicitly and rely on the template to piece those together?

Posted by Michael Morgan at Mar 31, 2005 07:31 | Permalink
Site powered by a free Open Source Project / Non-profit License (more) of Confluence - the Enterprise wiki.
Learn more or evaluate Confluence for your organisation.
Powered by Atlassian Confluence, the Enterprise Wiki. (Version: 2.2.7 Build:#524 Jul 28, 2006) - Bug/feature request - Contact Administrators