Closed Bug 962853 Opened 10 years ago Closed 10 years ago

vcs-sync needs to be able to publish git-hg mappings to mapper

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mozilla, Assigned: pmoore)

References

Details

Attachments

(2 files, 3 obsolete files)

Via the POST /mapper/PROJECT/insert/ignoredups api.

Ideally we save a copy of the full mapfile when it's successful, and diff the built mapfile against the saved mapfile and only insert the new entries (and save the full mapfile if it's successful, rinse, repeat).
Blocks: 929336
Depends on: 847640
When this is done, update b2g_bumper to use the new url:
http://hg.mozilla.org/build/mozharness/file/1d65ffa9b514/configs/b2g_bumper/master.py#l28
Blocks: 962863
Depends on: 842189
Will work on this next, while mapper is getting deployed to web cluster staging environment...
Incidentally, I'll also see if I can patch the mozharness code to provide git notes at the same time for the hg changeset shas. Haven't worked out an equivalent for updating hg yet with git shas - but this can be added at a later time.
Working on this here: https://github.com/petemoore/build-mozharness/tree/bug962853
Please note this is not finished work, just providing the link for sake of transparency as I work on it.
I'm most likely going to r- for adding requests as a pre-run dependency.  If we checked this in, all vcs-sync processes would die and wouldn't restart without human intervention, and we would have to install requests on all new vcs-sync boxes before starting a new vcs-sync process.

You can install requests in your virtualenv by

* adding requests to the virtualenv_modules http://hg.mozilla.org/build/mozharness/file/b4245fa9628a/configs/vcs_sync/build-repos.py#l65
* importing requests as needed after the create-virtualenv step, like this: http://hg.mozilla.org/build/mozharness/file/b4245fa9628a/mozharness/mozilla/testing/device.py#l634

That would make the scripts pick up the new requirement, install requests, and continue on without manual intervention.
Assignee: nobody → pmoore
Thanks Aki!

OK code all written, now just testing:
https://github.com/petemoore/build-mozharness/compare/bug962853

As soon as I'm happy it is all working properly, I'll submit a review request.

Pete
Please also note I've only initially added for build-repos (where I'll be testing) but of course I'll be adding to all the configs so that all push to mapper. Also please note I haven't put authentication in there yet for pushing to mapper. In other words - please hold off reviewing until I'm ready - just providing the links above for transparency.
https://github.com/petemoore/build-mozharness/compare/bug962853#diff-6855a10b776d4ad8c3140dcf6ecd73deR18

https://docs.python.org/2/library/sets.html?highlight=sets#sets
Deprecated since version 2.6: The built-in set/frozenset types replace this module.

This isn't a review, just pointing out something I noticed when glancing at the code.
(In reply to Aki Sasaki [:aki] from comment #8)
> https://github.com/petemoore/build-mozharness/compare/bug962853#diff-
> 6855a10b776d4ad8c3140dcf6ecd73deR18
> 
> https://docs.python.org/2/library/sets.html?highlight=sets#sets
> Deprecated since version 2.6: The built-in set/frozenset types replace this
> module.
> 
> This isn't a review, just pointing out something I noticed when glancing at
> the code.

Thanks Aki - good spot - I'll get it using frozenset on my next deployment.

Please note I'm testing this in parallel to bug 847640 in staging now - see Testing in staging, see: https://bugzilla.mozilla.org/show_bug.cgi?id=847640#c19 for details.
Attached patch bug962853_mozharness.patch (obsolete) — Splinter Review
This is the vcs part to use the new mapper, and to publish the git notes. I've had this working in staging, and also tested things like bringing down mapper, disabling the github account, etc - and checked that we got warning emails, and that vcs sync was able to automatically recover when systems were available again.

I also ran tests that even after disruptions, the mappings in the database stayed consistent with the map files, even though only deltas are published.

I'm quite confident this should be working well.

Let me know if you need more info, or would like to meet discuss.

Enjoy!
Pete
Attachment #8430325 - Flags: review?(aki)
Comment on attachment 8430325 [details] [diff] [review]
bug962853_mozharness.patch

Could you use a -U9 to your diff for more context next time?

r- especially because we're going to affect gecko.git, gecko-dev, gecko-projects as well with this commit, and it's not clear that's intentional.

>diff --git a/dist/mozharness-0.7-py2.7.egg b/dist/mozharness-0.7-py2.7.egg
>new file mode 100644
>index 0000000..4c72d86
>Binary files /dev/null and b/dist/mozharness-0.7-py2.7.egg differ

I don't think you want to add this file.

>diff --git a/mozharness/base/vcs/vcssync.py b/mozharness/base/vcs/vcssync.py
>index 9dee2e7..0c81870 100755
>--- a/mozharness/base/vcs/vcssync.py
>+++ b/mozharness/base/vcs/vcssync.py
>@@ -43,10 +43,10 @@ class VCSSyncScript(VCSScript):
>             error_contents = self.get_output_from_command(
>                 ["egrep", "-C5", "^[0-9:]+ +(ERROR|CRITICAL|FATAL) -", info_log],
>                 silent=True,
>-            )
>+            )[0:10240] # limit to 10KB in size (large emails fail to send)
>         if fatal:
>             subject = "[vcs2vcs] Failed conversion for %s" % job_name
>-            text = message + '\n\n'
>+            text = message[0:10240] + '\n\n' # limit message to 10KB in size (large emails fail to send)
>         if not self.successful_repos:
>             subject = "[vcs2vcs] Successful no-op conversion for %s" % job_name
>         if error_contents and not fatal:

a) can we detect when we pass this mark and add something to the email to say we're truncating?
b) what did you run into that hit this limit?

>diff --git a/scripts/vcs-sync/vcs_sync.py b/scripts/vcs-sync/vcs_sync.py
<snip>
>+import shutil

I don't think you need this.

>@@ -86,6 +89,8 @@ class HgGitScript(VirtualenvMixin, TooltoolMixin, TransferMixin, VCSSyncScript):
>                 'create-virtualenv',
>                 'update-stage-mirror',
>                 'update-work-mirror',
>+                'create-git-notes',
>+                'publish-to-mapper',

Adding these to the default_actions will also affect gecko.git, gecko-dev, and gecko-projects.  Are you intending to roll them out across all of those after only testing the build repos, or is this an oversight?  If you've done testing around gecko.git, let me know.  If not, I think you should remove it from the default actions, and override the default actions in the build repos config file.

I would also like some way to skip all git note behavior via config.

>@@ -409,6 +414,8 @@ intree=1
>             """
>         commands = []
>         if refs_list:
>+            # to skip pushing tags, uncomment (useful for testing)
>+            # refs_list = [x for x in refs_list if 'tags' not in x]

This is harmless, but did you consider altering your tag_config instead?

>@@ -473,7 +480,8 @@ intree=1
>                 # We query hg for these because the conversion dir will have
>                 # branches from multiple hg repos, and the regexes may match
>                 # too many things.
>-                refs_list = []
>+                # always include refs/notes/commits for git notes taken of hg shas
>+                refs_list = ['+refs/notes/commits:refs/notes/commits']

This behavior should be optional, especially since it's specific to one direction (hg->git), while I believe this script will handle git->git and git->hg as well.

>+    def pull_out_new_sha_lookups(self, old_file, new_file):
>+        """ This method will return an iterator which iterates through lines in file
>+            new_file that do not exist in old_file. If old_file can't be read, all
>+            lines in new_file are returned. It does not cause any problems if lines
>+            exist in old_file that do not exist in new_file. Results are sorted by
>+            the second field (text after first space in line).
>+
>+            This is somewhat equivalent to:
>+               ( [ ! -f "${old_file}" ] && cat "${new_file}" || diff "${old_file}" "${new_file}" | sed -n 's/> //p' ) | sort -k2"""
>+        try:
>+            with open(old_file, 'rt') as old:

self.opened() might work here?

>+        with open(new_file, 'rt') as new:

And here.

>+            repo = repo_config['repo']
>+            if os.path.exists(delta_git_notes):
>+                os.remove(delta_git_notes)

self.rmtree() ?

>+            with open(delta_git_notes, "w") as delta_out:

self.write_to_file() ?
Could possibly also use self.opened() to give you a filehandle.

>+            if git_notes_adding_successful:
>+                shutil.copyfile(complete_mapfile, added_to_git_notes)

self.copyfile() ?

>+    def publish_to_mapper(self):
>+        """ This method will attempt to create git notes for any new git<->hg mappings
>+            found in the generated_mapfile file and also push new mappings to mapper service."""
>+        for repo_config in self.query_all_repos():
>+            dest = self.query_abs_conversion_dir(repo_config)
>+            # 'git-mapfile' is created by hggit plugin, containing all the mappings
>+            complete_mapfile = os.path.join(dest, '.hg', 'git-mapfile')
>+            # 'published-to-mapper' is all the mappings that are known to be published
>+            # to mapper, for this project (typically the 'git-mapfile' from the previous
>+            # run)
>+            published_to_mapper = os.path.join(dest, '.hg', 'published-to-mapper')
>+            # 'delta-for-mapper' is the set of mappings that need to be published to
>+            # mapper on this iteration, i.e. the diff between the previous two files
>+            # described
>+            delta_for_mapper = os.path.join(dest, '.hg', 'delta-for-mapper')
>+            if os.path.exists(delta_for_mapper):
>+                os.remove(delta_for_mapper)

self.rmtree()

>+            # pushed to mapper
>+            mapper_config = repo_config.get('mapper', {})
>+            if mapper_config:
>+                site_packages_path = self.query_python_site_packages_path()
>+                if site_packages_path not in sys.path:
>+                    sys.path.append(site_packages_path)
>+                try:
>+                    import requests
>+                except BaseException as e:

Probably ImportError.

>+                mapper_url = mapper_config.get('url')
>+                mapper_project = mapper_config.get('project')
>+                insert_url = "%s/%s/insert/ignoredups" % (mapper_url, mapper_project)

You're allowing for 'url' and 'project' not existing (by using .get()), and then assuming they exist.  We should probably fail here somehow if they don't.  An easy way is by getting rid of the .get()s:

mapper_url = mapper_config['url']
mapper_project = mapper_config['project']

>+                with open(delta_for_mapper, "w") as delta_out:
>+                    delta_out.write("".join(all_new_mappings))

self.write_to_file()

>+                if publish_successful:
>+                    # if we get this far, we know we could successfully post to mapper, so now
>+                    # we can copy the mapfile over "previously generated" version
>+                    # so that we don't push to mapper for these commits again
>+                    shutil.copyfile(complete_mapfile, published_to_mapper)
>+            else:
>+                shutil.copyfile(complete_mapfile, published_to_mapper)

self.copyfile() for these 2.

>@@ -877,6 +1013,7 @@ intree=1
>         for repo_config in self.query_all_repos():
>             if self.query_failure(repo_config['repo_name']):
>                 self.info("Skipping %s." % repo_config['repo_name'])
>+                # comment out continue below to force push
>                 continue

When have you needed to do this?
Attachment #8430325 - Flags: review?(aki) → review-
Thanks Aki.

I've applied your suggested changes, and am retesting. The changes I made are sat here, if you are curious:
https://github.com/petemoore/build-mozharness/commit/1c28acded694467ed7b7314c5996d16def8b32ef

Will send back for review after testing.

Thanks!
Pete
Hi Aki,

Again, thanks for your thorough review, very much appreciated.

To answer the specific questions you raised:

(In reply to Aki Sasaki [:aki] from comment #11)

> Could you use a -U9 to your diff for more context next time?

I will certainly do that - and this is a point nthomas already picked me up on - apologies I forgot (I thought I'd updated my git config, but clearly not, and I should have double checked the diff before attaching - so doubly bad of me).


> >diff --git a/dist/mozharness-0.7-py2.7.egg b/dist/mozharness-0.7-py2.7.egg
> >new file mode 100644
> >index 0000000..4c72d86
> >Binary files /dev/null and b/dist/mozharness-0.7-py2.7.egg differ
> 
> I don't think you want to add this file.

I really should have checked the diff before attaching - apologies for this.

> 
> >diff --git a/mozharness/base/vcs/vcssync.py b/mozharness/base/vcs/vcssync.py
> >index 9dee2e7..0c81870 100755
> >--- a/mozharness/base/vcs/vcssync.py
> >+++ b/mozharness/base/vcs/vcssync.py
> >@@ -43,10 +43,10 @@ class VCSSyncScript(VCSScript):
> >             error_contents = self.get_output_from_command(
> >                 ["egrep", "-C5", "^[0-9:]+ +(ERROR|CRITICAL|FATAL) -", info_log],
> >                 silent=True,
> >-            )
> >+            )[0:10240] # limit to 10KB in size (large emails fail to send)
> >         if fatal:
> >             subject = "[vcs2vcs] Failed conversion for %s" % job_name
> >-            text = message + '\n\n'
> >+            text = message[0:10240] + '\n\n' # limit message to 10KB in size (large emails fail to send)
> >         if not self.successful_repos:
> >             subject = "[vcs2vcs] Successful no-op conversion for %s" % job_name
> >         if error_contents and not fatal:
> 
> a) can we detect when we pass this mark and add something to the email to
> say we're truncating?
> b) what did you run into that hit this limit?

a) If updated to say when we truncate
b) Actually, it was a code error, causing an ERROR to occur for each git commit when it tried to create a git note - after fixing the broken code, we shouldn't get it again. But it alerted me to the fact that there is a limit to the size of message we can send, so I thought it would be wise to add a maximum size in, to protect us against major fallout where for an unknown reason we'd get a huge number of errors, and make sure we still get the email. So to be clear - the only problem it helped with was a code error, which is now fixed, but I think it is good future-proofing, and does no harm.

> 
> >@@ -877,6 +1013,7 @@ intree=1
> >         for repo_config in self.query_all_repos():
> >             if self.query_failure(repo_config['repo_name']):
> >                 self.info("Skipping %s." % repo_config['repo_name'])
> >+                # comment out continue below to force push
> >                 continue
> 
> When have you needed to do this?

This was when I made a fix to the git notes. There had not been any new commits, but I changed the format of the git notes, which meant I needed to push, despite their being no code changes. Another example might be if e.g. a github repo is compromised, and then we need to force push it back to its original state, even though there are no changes upstream. Or maybe someone in our team accidentally commits against the github repo (I believe we personally have the privileges to push commits to the build-* repos, even though only vcs sync should be doing that - this would be another way the repo could get corrupted, and a force push despite no changes in hg, would be a way to fix it). I can remove the comment if you like.

So - this was just to answer the questions. All the other points you raised, I've tried to implement in the code, and that is now back in testing again, together with the new mapper. Since the changes are not *major* I hope I can get them tested quite quickly next week.

Have a nice weekend!
Pete

P.S. At some point I'll try to write some unit tests for vcs sync too...
Disclaimer: I could not apply my original patch because of bogus 'dist/mozharness-0.7-py2.7.egg' file:

error: cannot apply binary patch to 'dist/mozharness-0.7-py2.7.egg' without full index line
error: dist/mozharness-0.7-py2.7.egg: patch does not apply

So I manually removed this bogus file from the original patch, so that I could apply it, and then created this incremental patch from there.

I'll also create a full patch to replace the old one (https://bug962853.bugzilla.mozilla.org/attachment.cgi?id=8430325) so that you can see it both ways (incremental patch and full patch)...
Attachment #8434422 - Flags: review?(aki)
A full patch, replacing the previous one.

Marking both as r? but of course you only need to review one of them.
Attachment #8430325 - Attachment is obsolete: true
Attachment #8434423 - Flags: review?(aki)
(In reply to Pete Moore [:pete][:pmoore] from comment #13)
> > >@@ -877,6 +1013,7 @@ intree=1
> > >         for repo_config in self.query_all_repos():
> > >             if self.query_failure(repo_config['repo_name']):
> > >                 self.info("Skipping %s." % repo_config['repo_name'])
> > >+                # comment out continue below to force push
> > >                 continue
> > 
> > When have you needed to do this?
> 
> This was when I made a fix to the git notes. There had not been any new
> commits, but I changed the format of the git notes, which meant I needed to
> push, despite their being no code changes. Another example might be if e.g.
> a github repo is compromised, and then we need to force push it back to its
> original state, even though there are no changes upstream. Or maybe someone
> in our team accidentally commits against the github repo (I believe we
> personally have the privileges to push commits to the build-* repos, even
> though only vcs sync should be doing that - this would be another way the
> repo could get corrupted, and a force push despite no changes in hg, would
> be a way to fix it). I can remove the comment if you like.

Yeah, I think I deleted the repo_update.json to force everything to push.
This probably means we need some other way to signal the process to do a full push (or per-repo or per-target push).  This is fine for now.
Comment on attachment 8434422 [details] [diff] [review]
An incremental patch against my previous patch (https://bug962853.bugzilla.mozilla.org/attachment.cgi?id=8430325)

>-            text = message[0:10240] + '\n\n' # limit message to 10KB in size (large emails fail to send)
>+            text = ''
>+            if len(message) > 10240:
>+                text += '*** Message below has been truncated: it was %s characters, and has been reduced to 10240 characters:\n\n' % len(message)
>+            text += message[0:10240] + '\n\n' # limit message to 10KB in size (large emails fail to send)

text will end up being >10240 characters.  I'm guessing that's fine, and 10240 is an arbitrary limit?

>         if error_contents:
>-            text += '\n%s\n\n' % error_contents
>+            if len(error_contents) > 10240:
>+                text += '\n*** Message below has been truncated: it was %s characters, and has been reduced to 10240 characters:\n' % len(error_contents)
>+            text += '\n%s\n\n' % error_contents[0:10240] # limit message to 10KB in size (large emails fail to send)

Here too.
Comment on attachment 8434423 [details] [diff] [review]
An updated mozharness patch with all changes so far

r+ based on the interdiff + a quick unit.sh pass.
Thanks!
Attachment #8434423 - Flags: review?(aki) → review+
Comment on attachment 8434422 [details] [diff] [review]
An incremental patch against my previous patch (https://bug962853.bugzilla.mozilla.org/attachment.cgi?id=8430325)

The interdiff helped, thanks for posting both!
Attachment #8434422 - Flags: review?(aki)
Depends on: 1023576
BTW, after migrating to an new database schema for mapper in staging (bug 1023561) - rather than sharing relengapi schema, I hit a new bug when repopulating mapper. I discovered this bug was introduced in patch attachment 8434422 [details] [diff] [review] from comment 14. The bug was in using the self.opened(...) construct. I should have written:

with self.opened(...) as (...):
    if err:
        <fail code>
    else:
        <pass code>

but I missed the "else:" in all three instances where I used it, and instead had:

with self.opened(...) as (...):
    if err:
        <fail code>
    <pass code>

Therefore in the case or error, both <fail code> and <pass code> ran, instead of just <fail code>. :(

So I have fixed this now in the following commit:
https://github.com/petemoore/build-mozharness/commit/261da887d524f8af557b315f1d6c23984837a357

I'll attach a patch.

Pete
Don't be fooled by the rather large diff - that is mostly because one block of code got indented - so it looks like a big removal block and a big addition block, but in reality, it just got indented because of the added "else:" above it...

Thanks Aki!
Attachment #8434422 - Attachment is obsolete: true
Attachment #8439340 - Flags: review?(aki)
Updating simply to keep full patch and delta patches in-sync.
Attachment #8434423 - Attachment is obsolete: true
Attachment #8439340 - Flags: review?(aki) → review+
Attachment #8439340 - Flags: checked-in+
This is now live, for the build-* repos (vcs2vcs on github-sync2.dmz.scl3.mozilla.com).

e.g. sample API 'GET' calls:

https://api.pub.build.mozilla.org/mapper/build-puppet/rev/git/69d64a8a1
https://api.pub.build.mozilla.org/mapper/build-puppet/rev/hg/68f1b2b9996c4e33
https://api.pub.build.mozilla.org/mapper/build-puppet/mapfile/full
https://api.pub.build.mozilla.org/mapper/build-mozharness/mapfile/since/29.05.2014%2017:02:09%20CEST


Caveats:

It is *not* live for gecko-dev (vcs2vcs on vcssync1.srv.releng.usw2.mozilla.com).
It is *not* live for gecko-projects (vcs2vcs on vcssync2.srv.releng.usw2.mozilla.com).
It is also not live for l10n and gecko-git which are not in production yet.

Aki, I can now test enabling publish-to-mapper (and git notes generation) in staging for both gecko-dev and gecko-projects - maybe this is just a few days work to set this up and test it, or I can finish the l10n rollout first. Which is higher priority?

Thanks,
Pete
Flags: needinfo?(aki)
(I assume I should not close this bug until all projects are publishing to mapper)
(In reply to Pete Moore [:pete][:pmoore] from comment #23)
> This is now live, for the build-* repos (vcs2vcs on
> github-sync2.dmz.scl3.mozilla.com).

Awesome, thanks Pete!

Hm, these two print out *both* changesets for me:
> https://api.pub.build.mozilla.org/mapper/build-puppet/rev/git/69d64a8a1

69d64a8a18e6e001eb015646a82bcdaba0e78a24 68f1b2b9996c4e33aa57771b3478932c9fb7e161

> https://api.pub.build.mozilla.org/mapper/build-puppet/rev/hg/68f1b2b9996c4e33

69d64a8a18e6e001eb015646a82bcdaba0e78a24 68f1b2b9996c4e33aa57771b3478932c9fb7e161

Is this intentional?  I was expecting the other changeset and nothing else.  Otherwise our tooling that queries mapper will need to all get rid of the cruft, when it should be easiest to just return the cruftless value.
Hoping this is a quick fix.

> Caveats:
> 
> It is *not* live for gecko-dev (vcs2vcs on
> vcssync1.srv.releng.usw2.mozilla.com).
> It is *not* live for gecko-projects (vcs2vcs on
> vcssync2.srv.releng.usw2.mozilla.com).
> It is also not live for l10n and gecko-git which are not in production yet.
> 
> Aki, I can now test enabling publish-to-mapper (and git notes generation) in
> staging for both gecko-dev and gecko-projects - maybe this is just a few
> days work to set this up and test it, or I can finish the l10n rollout
> first. Which is higher priority?

I think it will be good to test gecko-dev and gecko-projects and gecko.git before rolling new-vcs gecko.git into production.  L10n rollout can happen on its own.  In the end, it's the same amount of work that needs to be done.

Currently it looks like l10n has no scripting that relies on mapper (I'd like to readd this, but it appears to be removed atm), so afaict can roll out completely and replace all legacy l10n conversion, unless we end up mirroring partner git repos a la bug 1025390 comment 3 and bug 997652 comment 28.  gecko.git will require that we run legacy gecko.git conversion in parallel until we switch over the mapper consumers, e.g. http://hg.mozilla.org/build/mozharness/file/316033a17e14/mozharness/mozilla/mapper.py , to use new mapper.

So it's not a matter of priority, but l10n is a discrete task that can be rolled out in its entirety, and gecko* is a larger bundle of tasks that need to happen within short order of each other.  Because of this, if I were you, I'd focus on and finish l10n first to juggle fewer balls in the air at any one point.  It's up to you, though.

(In reply to Pete Moore [:pete][:pmoore] from comment #24)
> (I assume I should not close this bug until all projects are publishing to
> mapper)

Up to you.  We do need some bug tracking l10n and gecko.git mapper population.  This can be that bug, or we can add them to the existing l10n and gecko.git bugs, or file new.
Flags: needinfo?(aki)
(In reply to Aki Sasaki [:aki] from comment #25)
> (In reply to Pete Moore [:pete][:pmoore] from comment #23)
> Hm, these two print out *both* changesets for me:
> > https://api.pub.build.mozilla.org/mapper/build-puppet/rev/git/69d64a8a1
> 
> 69d64a8a18e6e001eb015646a82bcdaba0e78a24
> 68f1b2b9996c4e33aa57771b3478932c9fb7e161
> 
> > https://api.pub.build.mozilla.org/mapper/build-puppet/rev/hg/68f1b2b9996c4e33
> 
> 69d64a8a18e6e001eb015646a82bcdaba0e78a24
> 68f1b2b9996c4e33aa57771b3478932c9fb7e161
> 
> Is this intentional?  I was expecting the other changeset and nothing else. 
> Otherwise our tooling that queries mapper will need to all get rid of the
> cruft, when it should be easiest to just return the cruftless value.
> Hoping this is a quick fix.

Ah ok. Well I can change it if you like, but I'll just explain why I like it currently this way: when you query with a SHA that is less than 40 chars, and it matches, you get back the fully qualified 40 character SHA in other words, in the example above, we are querying against the git SHA 69d64a8a1, and when it gives back both the matched git sha, and the corresponding hg sha, we know that the full/complete git SHA was 69d64a8a18e6e001eb015646a82bcdaba0e78a24. In other words, it is giving you information that you didn't provide (and therefore may not have) - so it is a way also to get the full sha, if you only have a short part. I thought it was better to be explicit by giving back the matched git SHA too - but if this is a problem, I can of course change it. An alternative might be to return a json response, which is a bit more canonical, which could contain the upstream url of the source repository, the date it was added, the hg sha, the git sha, the project name, etc. With a json response it should be easy for most clients to just pull out the parameters they are interested in. What do you think?

> So it's not a matter of priority, but l10n is a discrete task that can be
> rolled out in its entirety, and gecko* is a larger bundle of tasks that need
> to happen within short order of each other.  Because of this, if I were you,
> I'd focus on and finish l10n first to juggle fewer balls in the air at any
> one point.  It's up to you, though.

Sound advice. I'll get l10n done first.

> (In reply to Pete Moore [:pete][:pmoore] from comment #24)
> > (I assume I should not close this bug until all projects are publishing to
> > mapper)
> 
> Up to you.  We do need some bug tracking l10n and gecko.git mapper
> population.  This can be that bug, or we can add them to the existing l10n
> and gecko.git bugs, or file new.

I'll rename this bug slightly (to say vcs-sync needs to be able to populate mapper) since this functionality is done, and I'll add separate bugs for the "switch it on for project X", which can be tracked separately.

Thanks Aki!
Summary: vcs-sync needs to populate mapper db once it's live → vcs-sync needs to be able to publish git-hg mappings to mapper
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
(In reply to Pete Moore [:pete][:pmoore] from comment #26)
> (In reply to Aki Sasaki [:aki] from comment #25)
> > (In reply to Pete Moore [:pete][:pmoore] from comment #23)
> > Hm, these two print out *both* changesets for me:
> > > https://api.pub.build.mozilla.org/mapper/build-puppet/rev/git/69d64a8a1
> > 
> > 69d64a8a18e6e001eb015646a82bcdaba0e78a24
> > 68f1b2b9996c4e33aa57771b3478932c9fb7e161
> > 
> > > https://api.pub.build.mozilla.org/mapper/build-puppet/rev/hg/68f1b2b9996c4e33
> > 
> > 69d64a8a18e6e001eb015646a82bcdaba0e78a24
> > 68f1b2b9996c4e33aa57771b3478932c9fb7e161
> > 
> > Is this intentional?  I was expecting the other changeset and nothing else. 
> > Otherwise our tooling that queries mapper will need to all get rid of the
> > cruft, when it should be easiest to just return the cruftless value.
> > Hoping this is a quick fix.
> 
> Ah ok. Well I can change it if you like, but I'll just explain why I like it
> currently this way: when you query with a SHA that is less than 40 chars,
> and it matches, you get back the fully qualified 40 character SHA in other
> words, in the example above, we are querying against the git SHA 69d64a8a1,
> and when it gives back both the matched git sha, and the corresponding hg
> sha, we know that the full/complete git SHA was
> 69d64a8a18e6e001eb015646a82bcdaba0e78a24. In other words, it is giving you
> information that you didn't provide (and therefore may not have) - so it is
> a way also to get the full sha, if you only have a short part. I thought it
> was better to be explicit by giving back the matched git SHA too - but if
> this is a problem, I can of course change it. An alternative might be to
> return a json response, which is a bit more canonical, which could contain
> the upstream url of the source repository, the date it was added, the hg
> sha, the git sha, the project name, etc. With a json response it should be
> easy for most clients to just pull out the parameters they are interested
> in. What do you think?

json wfm.
So does having a separate urls to give you the full mapping (e.g. mapfile where hg sha starts with abcde12345), or just the corresponding sha. 

> > So it's not a matter of priority, but l10n is a discrete task that can be
> > rolled out in its entirety, and gecko* is a larger bundle of tasks that need
> > to happen within short order of each other.  Because of this, if I were you,
> > I'd focus on and finish l10n first to juggle fewer balls in the air at any
> > one point.  It's up to you, though.
> 
> Sound advice. I'll get l10n done first.

This will also help a great deal next merge day (5 weeks from this past monday)... the large patch I showed you ~7 weeks ago will be needed then if we're not switched over.

If there are time consuming staging passes needed for gecko* that we can easily start now with little manual intervention, then starting those and letting them run while you work on l10n might also be good.  I'll leave it up to you.

> I'll rename this bug slightly (to say vcs-sync needs to be able to populate
> mapper) since this functionality is done, and I'll add separate bugs for the
> "switch it on for project X", which can be tracked separately.
> 
> Thanks Aki!

wfm, thank you.
Component: Tools → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: