The Problem
The programs that make up Bouncer all generate the paths for downloadables the same way:
${mirror_basepath}/${pv_path}/${oss_path}/${lang_path}/${file_name}
Some products are exceptions to this rule and need a different path:
${mirror_basepath}/${pv_path}/${file_name}
The bouncer programs would be more flexible if they could account for this difference. This problem comes up both in extending Bouncer for OpenOffice and the newly discovered Mozilla seamonkey non-conforming URL.
Accepted Solution
In the 'files' table, replace the name of the file with a reference to a pathname template. Add a new table called 'templates'.
Further, to refactor all the Bouncer programs in the use of this new template, add a new column to the 'downloadable' table called 'url'. When Sentry populates this table, it could build the complete URL for the downloadable at that time. Any other program using the 'downloadable' table, as well as Sentry itself, could use the prebuilt URL instead of generating the value on the fly.
Add to 'product_versions' short names for the product and version. This will assist in generating actual file names.
Summary of DB Schema Changes
| table name |
Column Name |
Use |
type |
| downloadable |
downloadable_url |
add new |
varchar(255) |
| files |
template_id |
add new |
int |
| files |
name |
removed |
|
| product_versions |
pv_short_product_name |
add new |
varchar(255) |
| product_versions |
pv_short_version |
add new |
varchar(255) |
| templates |
template_id |
new |
int |
| templates |
template_text |
new |
varchar(255) |
| sentry_log |
sentry_log_id |
new |
int (auto) |
| sentry_log |
downloadable_id |
new |
int |
| sentry_log |
sentry_log_message |
new |
varchar(255) |
Previous Proposals
Proposal 1
Give the 'products_versions' table an additional field called 'path_template'. The values in this field are strings in the format used in the problem statement above: column names from the database used as substitution variables in the classic UNIX style. Any text without the dollar sign prefix is considered literal text. Punctuation, such as '/', are also considered literals. It's easy to understand. Only users adding new products would encounter this template field. The default value that the user sees should be configurable.
Further, to refactor all the Bouncer programs in the use of this new template, I propose adding a new column to the 'downloadable' table called 'url'. When Sentry populates this table, it could build the complete URL for the downloadable at that time. Any other program using the 'downloadable' table, as well as Sentry itself, could use the prebuilt URL instead of generating the value on the fly.
This has several advantages.
First, it localizes the interpretation of the template to one place. Should the interpretation need to change, the code only needs to change in one place. No other programs that use the 'downloadable' table need to know how to interpret the template.
Second, at least for Sentry, this can simplify the SQL used to create a URL. When Sentry wants a URL, it currently executes a query to get the various parts and then creates the URL on the fly. If the URL is prebuilt, Sentry's queries do not need multilevel joins. While this makes for more efficient SQL, its greater value is in the simplification of the code itself. The simpler things look in the code, the easier it will be for future maintainers.
Proposal 1 Discussion
This works for the Mozilla Seamonkey problem, but it does not help the OpenOffice Problem. The Mozilla project has the path exception based on the product. The OpenOffice project's path exception is basedon the language not the product.
Consider this case for OpenOffice: Prod1Ver1 for US_EN would be ${mirror_basepath}/stable/${file_name} while Prod1Ver1 for ko_kr would be ${mirror_basepath}/localizations/${file_name}. Having the template associated with 'Product_verisions' just doesn't quite work.
Proposal 2
Add a new table called 'url_templates'. It would have five columns: id, pv, os, lang, template (expand actual names to fit the scheme of the whole system). Alternatively, it may be simpler to put this data in a configuration file.
For Mozilla, the table could look like this:
| id |
pv |
os |
lang |
template |
| 1 |
|
|
|
${base_path}/${pv_path}/${oss_path}/${lang_path}/${file_name} |
| 2 |
SeamonkeyID |
|
|
${base_path}/${pv_path}/${file_name} |
for OpenOffice, it would look like this:
| id |
pv |
os |
lang |
template |
| 1 |
|
|
|
${base_path}/localizations/${file_name} |
| 2 |
|
|
US_EN_ID |
${base_path}/stable/${file_name} |
Just like in the previous proposal, we'd add the 'url' column to the 'downloadable' table and prebuild the url for any downloadable.
It would be Sentry's responsibility to interpret the 'url_templates' table and apply the appropriate template given the product, os and language that it has at hand. Sentry would have to find the template that has the most specific match for the three variables.
Proposal 2 Discussion
This is a rudimentry rules system. What would Sentry do with conflicts bewteen rules? Perhaps assign a weight based on the id number of the rule? A conflict could be resolved by always choosing the rule with the higher id number.
What about lists in the table cells to save making multiple entries? I don't think we need to take this that far. It's rudimentary and should stay that way.
Argument to use a config file instead of a table - The actual use of the data in the table is not one that is done using SQL. Sentry has to download the entire table to use the rules. Why put something in the database if SQL is not going to be used as the primary means to use it? This also simplifies the user interface. Imagine a config file addition to Sentry like this:
pathTemplateRules=(('*', '*', '*', '${mirror_basepath}/${pv_path}/${oss_path}/${lang_path}/${file_name}'),
( 1 , '*', '*', '${mirror_basepath}/${pv_path}/${file_name}'))
This would have to be all in one line, though. This is very Pythonic in form, so it should be easy integrate into Sentry. For a complex rule system, it gets pretty verbose and begs for a easier user interface.
Proposal 3
In the 'files' table, replace the name of the file with a pathname template. Just like with the previous proposals, add the 'url' column to the 'downloadable' table to prebuild the entire URL.
So that we can allow munging of the file name itself, I propose adding a couple more columns to the 'product_versions' table: pv_short_product_name and pv_short_version. The values in these fields would be available as variables to the templates. That way we could make templates with a finer granularity for variable content:
${pv_path}/${os_path}/${lang_path}/${pv_short_product_name} ${pv_short_version}.dmg
Proposal 3 Discussion
This simplifies the table structure over Proposal 2 while keeping Proposal 2's fine granularity.
Brilliant!
So, how could we go about munging filenames and not just directories?
Say:
{$mirror_basepath}/{$pv_path}/filename_{$os}_{$lang}.{$extension}
?