5 months ago


## A Python solution

### Edit II: gained 8 bytes (now 108):

**NB** This doesn't solve the problem completely correctly at present (it was written before the OP clarified the requirements).

```
from itertools import product as p
print(["".join(s) for s in p(*[i+{'C':'t','G':'a'}.get(i,'') for i in ""])])
```

~~Edit: a little enhancement at the cost of readability:~~

~~As a pseudo-oneliner, without counting the input string as bytes:~~

```
from itertools import product
print(["".join(s) for s in product(*[i + {'C':'t'}.get(i, '') for i in ""])])
```

~~112 bytes, or 116 if the dictionary should read~~ `{'C':'t','G':'a'}`.

Not 100% sure if this is correct, but I get the same output as Pierre:

```
from itertools import product
s = [i + {'C':'t'}.get(i, '') for i in "ATGCCCATG"]
print(["".join(sub) for sub in product(*s)])
```

### 127 bytes.

```
['ATGCCCATG', 'ATGCCtATG', 'ATGCtCATG', 'ATGCttATG', 'ATGtCCATG', 'ATGtCtATG', 'ATGttCATG', 'ATGtttATG']
```

If the G substitution is also needed (OP was unclear):

```
from itertools import product
s = [i + {'C':'t','G':'a'}.get(i, '') for i in "ATGCCCATG"]
print(["".join(sub) for sub in product(*s)])
```

I think I get the same as zx8754:

```
['ATGCCCATG', 'ATGCCCATa', 'ATGCCtATG', 'ATGCCtATa', 'ATGCtCATG', 'ATGCtCATa', 'ATGCttATG', 'ATGCttATa', 'ATGtCCATG', 'ATGtCCATa', 'ATGtCtATG', 'ATGtCtATa', 'ATGttCATG', 'ATGttCATa', 'ATGtttATG', 'ATGtttATa', 'ATaCCCATG', 'ATaCCCATa', 'ATaCCtATG', 'ATaCCtATa', 'ATaCtCATG', 'ATaCtCATa', 'ATaCttATG', 'ATaCttATa', 'ATatCCATG', 'ATatCCATa', 'ATatCtATG', 'ATatCtATa', 'ATattCATG', 'ATattCATa', 'ATatttATG', 'ATatttATa']
```
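As a quick sanity check on the two lists above: each convertible base doubles the number of variants, so the C-only version should give 2^3 = 8 strings and the C-and-G version 2^5 = 32. A minimal sketch of that check follows; the helper name `variants` is made up here for illustration and is not part of the golfed answer.

```python
from itertools import product

def variants(seq, table):
    # Each base becomes a string of choices, e.g. 'C' -> 'Ct', 'A' -> 'A';
    # product() then picks one character per position.
    choices = [b + table.get(b, "") for b in seq]
    return ["".join(p) for p in product(*choices)]

seq = "ATGCCCATG"
c_only = variants(seq, {"C": "t"})
c_and_g = variants(seq, {"C": "t", "G": "a"})

# One doubling per convertible base.
assert len(c_only) == 2 ** seq.count("C")
assert len(c_and_g) == 2 ** (seq.count("C") + seq.count("G"))
print(len(c_only), len(c_and_g))  # 8 32
```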

@Chris, it would be useful to have some 100% correct example output.

So that we can test our outputs, what is the output for `ACTGN`?

~~After I saw the others' answers, I'm not sure if we only need to change the 'C' in the sequence or the 'C' AND the 'G'.~~

Both need to change, since the methylation and the read you are looking at could be from either strand.

If I am understanding the problem correctly, you can't change C->T and G->A for the same read.

The only conversion that occurs chemically is C->T, so there are three conversions possible for the upper strand and two for the lower strand.

~~If my interpretation is correct, Pierre Lindenbaum's and jrj.healey's solutions are incomplete, and zx8754's solution is incorrect.~~

Edit: my interpretation was right, but all answers have been corrected and updated.

This is correct, it's either C->T or G->A, but not both!
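For reference, the "either C->T or G->A, but not both" rule can be sketched in ungolfed Python as below. This is only an illustration of the interpretation agreed on in these comments, not anyone's posted answer: the function names are invented here, the lowercase `t`/`a` marking follows the answer above, and dropping the duplicate unconverted sequence is an assumption about the desired output.

```python
from itertools import product

def conversions(seq, base, converted):
    """All variants of seq where each `base` is kept or replaced by `converted`."""
    choices = [b + converted if b == base else b for b in seq]
    return ["".join(p) for p in product(*choices)]

def either_strand(seq):
    """C->t variants (upper strand) plus G->a variants (lower strand), never mixed."""
    upper = conversions(seq, "C", "t")
    lower = conversions(seq, "G", "a")
    # The unconverted sequence appears in both lists; keep it only once (assumption).
    return upper + [s for s in lower if s != seq]

print(either_strand("ACTGN"))  # ['ACTGN', 'AtTGN', 'ACTaN']
```

For `ATGCCCATG` this gives 11 strings: 8 from the three Cs, plus 4 from the two Gs, minus the shared unconverted sequence.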

OK, R code corrected, gives same output as expected.

Updated this post with more details and example input/output. Sorry for the confusion!