This is an archive of the discontinued LLVM Phabricator instance.

Differential D116762

workflows: Make issue-subscriber more robust for labels with special characters
ClosedPublic

Authored by tstellar on Jan 6 2022, 11:55 AM.

Details

Reviewers

asl
kwk

Commits

rGa2adebf409ce: workflows: Make issue-subscriber more robust for labels with special characters

Summary

Also, replace the existing actions/github-script (JavaScript) implementation
with a Python script that can be run outside of GitHub Actions. The intention
is that, going forward, all GitHub Actions functionality will be implemented
in this script.
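
To illustrate why matching on the team name is more robust than rewriting characters by hand, here is a minimal sketch. The `Team` class, the team names, and the slugs are all made up for illustration; they are not the real LLVM teams, and the slug values are only an assumption about what a hosting service might generate.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Team:
    name: str  # display name of the team
    slug: str  # identifier used in @org/slug mentions


# Hypothetical teams; names and slugs are invented for this example.
TEAMS = [
    Team(name="issue-subscribers-c++20", slug="issue-subscribers-c-20"),
    Team(name="issue-subscribers-clang:frontend", slug="issue-subscribers-clang-frontend"),
]


def guess_slug(label: str) -> str:
    """Mirror the removed JavaScript: rewrite a few known characters by hand."""
    # JavaScript's replace() with a string pattern only replaces the first match,
    # hence count=1 for ':' and '/'. The space replacement used a global regex.
    cleaned = label.replace(" ", "-").replace(":", "-", 1).replace("/", "-", 1)
    return "issue-subscribers-" + cleaned


def find_mention(label: str, teams: Iterable[Team]) -> Optional[str]:
    """Mirror the new script: match on the team name, then reuse the stored slug."""
    wanted = "issue-subscribers-{}".format(label).lower()
    for team in teams:
        if team.name.lower() == wanted:
            return "@llvm/{}".format(team.slug)
    return None


guessed = guess_slug("c++20")            # '+' is not handled by the hand-written rules
mention = find_mention("c++20", TEAMS)   # slug comes from the team record instead
```

With the hand-written rules, any character that nobody anticipated ends up in the guessed slug, whereas the name lookup never has to know how slugs are derived.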

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tstellar requested review of this revision. Jan 6 2022, 11:55 AM

tstellar created this revision.

Herald added a project: Restricted Project. Jan 6 2022, 11:55 AM

tstellar added a reviewer: kwk. Jan 6 2022, 12:10 PM

Harbormaster completed remote builds in B141948: Diff 397954. Jan 6 2022, 12:22 PM

Since this looks like the beginning of something new and big, I strongly suggest considering GraphQL instead of the many REST calls that would otherwise be needed. Especially when you drive things from a GitHub Action, you have almost everything you need right at your fingertips, namely the global IDs for issues and the like. I'm working on something for another project, and here's my example of how one could encapsulate the GitHub functionality into easy-to-use classes: https://gist.github.com/kwk/c89b6a3e5eb40487fed78f226e982fcc#file-graphql-py-L138 .

Anyway this is just a thought.

.github/workflows/issue-subscriber.yml
15	I suspect that checking out the repo is not an option here? Anyway, if you absolutely need curl to download the script, be aware of some issues with this approach. I noticed that curl sometimes fetches an older version of the content than is currently on GitHub. The solution could be this:
	    curl \
	      --compressed \
	      -s \
	      -H 'Cache-Control: no-cache' \
	      https://raw.githubusercontent.com/$GITHUB_REPOSITORY/$GITHUB_SHA/llvm/utils/git/github-automation.py?$(uuidgen)
	You might wonder about the `--compressed` and caching options, or about the UUID appended to the end of the URL. These are all ways to ensure we get the freshest version of the file on GitHub. I got the inspiration for this from here: https://stackoverflow.com/questions/31653271/how-to-call-curl-without-using-server-side-cache?noredirect=1&lq=1
llvm/utils/git/github-automation.py
20	Being picky here: although this looks straightforward, I strongly suggest that we use typed Python:
	    def __init__(self, token:str, repo:str, issue_number:int, label_name:str):
	This makes functions so much more readable IMHO.
36–40	`os.getenv()` accepts a second argument that is used as the default if the environment variable is not set:
	    def get_default_repo():
	        return os.getenv('GITHUB_REPOSITORY', 'llvm/llvm-project')

This revision now requires changes to proceed. Jan 6 2022, 1:45 PM

Address review comments

tstellar marked 2 inline comments as done. Jan 6 2022, 11:04 PM

tstellar added inline comments.

.github/workflows/issue-subscriber.yml
15	I don't think any older content exists, because we are pulling a specific version of the file: GITHUB_SHA.

In D116762#3226065, @kwk wrote:

Since this looks like the beginning of something new and big, I strongly suggest considering GraphQL instead of the many REST calls that would otherwise be needed. Especially when you drive things from a GitHub Action, you have almost everything you need right at your fingertips, namely the global IDs for issues and the like. I'm working on something for another project, and here's my example of how one could encapsulate the GitHub functionality into easy-to-use classes: https://gist.github.com/kwk/c89b6a3e5eb40487fed78f226e982fcc#file-graphql-py-L138 .

Anyway this is just a thought.

It would be nice if the PyGithub module had wrappers for this already. What are the advantages of GraphQL over the REST API?

Harbormaster completed remote builds in B142019: Diff 398052. Jan 7 2022, 12:14 AM

In D116762#3226744, @tstellar wrote:

It would be nice if the PyGithub module had wrappers for this already. What are the advantages of GraphQL over the REST API?

The advantage of GraphQL that I see is that you can aggregate more than one endpoint into one response. The following example is totally made up, but it showcases what you can do: suppose you have an issue comment that links to another issue and you want to get the assignee of the other issue. You could get this information with a single GraphQL call by using nested queries. This is also the reason why there's no wrapper for this and why you don't need one: GraphQL was designed so that you as a developer specify what you get in the result, so there is nothing to wrap.
The GraphQL explorer makes it very easy to tailor your queries/mutations to your needs by letting you define which (aggregated) fields the result should include.

Try out this example if you want:

{
  node(id: "IC_kwDOETcFfs47PM2o") {
    ... on IssueComment {
      body
      issue {
        id
      }
      author {
        login
      }
    }
  }
}
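
For readers who would rather script this than use the explorer, the following is a minimal sketch of how the query above could be wrapped in an HTTP request using only the Python standard library. It assumes GitHub's public GraphQL endpoint and a personal access token; the `build_request` helper and the `<token>` placeholder are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

GRAPHQL_URL = "https://api.github.com/graphql"

# Same selection as the query above, with the node ID passed as a variable.
QUERY = """
query($id: ID!) {
  node(id: $id) {
    ... on IssueComment {
      body
      issue { id }
      author { login }
    }
  }
}
"""


def build_request(token: str, node_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the GraphQL query."""
    payload = json.dumps({"query": QUERY, "variables": {"id": node_id}})
    return urllib.request.Request(
        GRAPHQL_URL,
        data=payload.encode("utf-8"),
        headers={
            "Authorization": "bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request("<token>", "IC_kwDOETcFfs47PM2o")

# Sending it needs a valid token and network access:
#   with urllib.request.urlopen(request) as response:
#       result = json.load(response)
```

The point of the sketch is that one request carries the comment body, its issue, and its author, which would take several REST calls to assemble.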

But comparing REST and GraphQL is really comparing apples and oranges. Let me forward this article and a quote from it:

Where GraphQL Shines

Unlike REST, GraphQL is designed to access multiple resources simultaneously. This means that you are not only able to be more precise in not only retrieving just the data you desire (something that is built into some of today’s modern RESTful APIs), but you are able to do so across multiple resources/ data models (with data joins automatically built in) in a single HTTP (or other applicable protocol) call.

GraphQL is also designed to be incredibly structured (so much so that the order of properties in the response is critical). This means clients will know exactly what to expect (and in what types) without having to pull in JSON or XML schemas. It also means the API is much easier to document as the possibilities are limited to its models, not its representations and dynamically managed relationships.

That being said, I think a GraphQL query can easily be verified for correctness outside the program code, whereas the REST API lets you use auto-completion inside your editor of choice. IMHO, with growing complexity of tasks or queries I'd lean towards GraphQL, and it's good practice to start small, i.e. with less complexity.

Sorry for this long debatable answer.

.github/workflows/issue-subscriber.yml
15	That's a fair point. Sorry for not seeing that.
llvm/utils/git/github-automation.py
22	Sorry for being picky. This is not needed, just a thought: I'd make this a property so that it cannot be reassigned from the outside:
	    # In __init__:
	    # ...
	    self._team_name = 'issue-subscribers-{}'.format(self.label_name).lower()

	    @property
	    def team_name(self) -> str:
	        return self._team_name
24	To help code assistants, it's nice to include the return type of this function:
	    def run(self) -> bool:
29	Do we ignore any exceptions that can happen? This question is important for future additions.
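
The effect of the read-only property suggested for line 22 can be seen in a small sketch. The class below is a trimmed stand-in that keeps only the team-name logic (no PyGithub calls), and the label `'Clang'` is just an example value.

```python
class IssueSubscriber:
    """Trimmed stand-in: only the team-name part of the reviewed class."""

    def __init__(self, label_name: str):
        self._team_name = 'issue-subscribers-{}'.format(label_name).lower()

    @property
    def team_name(self) -> str:
        # No setter is defined, so plain assignment to team_name is rejected.
        return self._team_name


subscriber = IssueSubscriber('Clang')

try:
    subscriber.team_name = 'something-else'
except AttributeError:
    rejected = True
else:
    rejected = False
```

This guards against accidental reassignment only; code that deliberately writes to `_team_name` can still change the value.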

Make team_name a property and add a return type to run()

tstellar marked 2 inline comments as done. Jan 10 2022, 9:38 PM

tstellar added inline comments.

llvm/utils/git/github-automation.py
29	That was my plan for this task. I'm not sure what we would do with the exceptions if we caught them.
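
One possible reading of not catching exceptions is that they simply propagate to the top of the script. The sketch below shows what that means for a caller, under the assumption that the script runs as a separate process (as it would in a workflow step): an uncaught Python exception ends the interpreter with a non-zero exit status. The one-line script and its error message are made up for illustration.

```python
import subprocess
import sys

# Stand-in for a script whose API call raises; the message is invented.
SCRIPT = "raise RuntimeError('team lookup failed')"

completed = subprocess.run(
    [sys.executable, "-c", SCRIPT],
    capture_output=True,
    text=True,
)

exit_code = completed.returncode   # non-zero because the exception was not caught
traceback_text = completed.stderr  # the traceback is written to standard error
```

A CI runner that treats a non-zero exit status as failure would then mark the step as failed and keep the traceback in its log, which may be all the handling a small automation task needs.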

Harbormaster completed remote builds in B142577: Diff 398834. Jan 10 2022, 10:48 PM

I don't see anything obvious to hold this back.

This revision is now accepted and ready to land. Jan 13 2022, 2:13 AM

Closed by commit rGa2adebf409ce: workflows: Make issue-subscriber more robust for labels with special characters (authored by tstellar). Jan 14 2022, 10:06 PM

This revision was automatically updated to reflect the committed changes.

tstellar added a commit: rGa2adebf409ce: workflows: Make issue-subscriber more robust for labels with special characters.

Revision Contents

Path	Size
.github/workflows/issue-subscriber.yml	34 lines
llvm/utils/git/github-automation.py	50 lines

Diff 400239

.github/workflows/issue-subscriber.yml

 name: Issue Subscriber

 on:
   issues:
     types:
       - labeled

 jobs:
   auto-subscribe:
     runs-on: ubuntu-latest
     if: github.repository == 'llvm/llvm-project'
     steps:
+      - name: Setup Automation Script
+        run: |
+          curl -O -L https://raw.githubusercontent.com/$GITHUB_REPOSITORY/$GITHUB_SHA/llvm/utils/git/github-automation.py
+          chmod a+x github-automation.py
+          pip install PyGithub
+
       - name: Update watchers
-        uses: actions/github-script@v5
-        with:
-          github-token: ${{ secrets.ISSUE_MENTION_SECRET }}
-          script: |
-            const teamname = "issue-subscribers-" + context.payload.label.name.replace(/ /g, "-").replace(":","-").replace("/","-");
-            const comment = "@llvm/" + teamname;
-            try {
-              // This will throw an exception if the team does not exist and no
-              // comment will be created.
-              team = await github.rest.teams.getByName({
-                org: context.repo.owner,
-                team_slug: teamname
-              });
-              github.rest.issues.createComment({
-                issue_number: context.issue.number,
-                owner: context.repo.owner,
-                repo: context.repo.repo,
-                body: comment
-              });
-            } catch (e){
-              console.log(e);
-            }
+        run: |
+          ./github-automation.py \
+            --token ${{ secrets.ISSUE_SUBSCRIBER_TOKEN }} \
+            issue-subscriber \
+            --issue-number ${{ github.event.issue.number }} \
+            --label-name ${{ github.event.label.name }}

llvm/utils/git/github-automation.py

This file was added.

Property	Old Value	New Value
File Mode	null	100755

#!/usr/bin/env python3
#
# ======- github-automation - LLVM GitHub Automation Routines--*- python -*--==#
#
# Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
# See https://llvm.org/LICENSE.txt for license information.
# SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
#
# ==-------------------------------------------------------------------------==#

import argparse
import github
import os

class IssueSubscriber:

    @property
    def team_name(self) -> str:
        return self._team_name

    def __init__(self, token:str, repo:str, issue_number:int, label_name:str):
        self.repo = github.Github(token).get_repo(repo)
        self.org = github.Github(token).get_organization(self.repo.organization.login)
        self.issue = self.repo.get_issue(issue_number)
        self._team_name = 'issue-subscribers-{}'.format(label_name).lower()

    def run(self) -> bool:
        for team in self.org.get_teams():
            if self.team_name != team.name.lower():
                continue
            comment = '@llvm/{}'.format(team.slug)
            self.issue.create_comment(comment)
            return True
        return False


parser = argparse.ArgumentParser()
parser.add_argument('--token', type=str, required=True)
parser.add_argument('--repo', type=str, default=os.getenv('GITHUB_REPOSITORY', 'llvm/llvm-project'))
subparsers = parser.add_subparsers(dest='command')

issue_subscriber_parser = subparsers.add_parser('issue-subscriber')
issue_subscriber_parser.add_argument('--label-name', type=str, required=True)
issue_subscriber_parser.add_argument('--issue-number', type=int, required=True)

args = parser.parse_args()

if args.command == 'issue-subscriber':
    issue_subscriber = IssueSubscriber(args.token, args.repo, args.issue_number, args.label_name)
    issue_subscriber.run()
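
As a usage illustration, the sketch below rebuilds just the argument parser from the script above and feeds it the kind of command line the workflow produces. The `build_parser` wrapper, the token, the issue number, and the label are placeholders invented for the example; no GitHub API call is made.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Same options as the script above, wrapped in a function for reuse."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--token', type=str, required=True)
    parser.add_argument('--repo', type=str,
                        default=os.getenv('GITHUB_REPOSITORY', 'llvm/llvm-project'))
    subparsers = parser.add_subparsers(dest='command')

    issue_subscriber_parser = subparsers.add_parser('issue-subscriber')
    issue_subscriber_parser.add_argument('--label-name', type=str, required=True)
    issue_subscriber_parser.add_argument('--issue-number', type=int, required=True)
    return parser


# Global options come before the sub-command, sub-command options after it.
args = build_parser().parse_args([
    '--token', 'dummy-token',
    'issue-subscriber',
    '--issue-number', '1234',
    '--label-name', 'c++20',
])
```

Because the parser only sees a list of strings, the same invocation works from a local shell, which is what makes the script usable outside of GitHub Actions.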